President’s MESSAGE



Roy L. Russo 


CS overview 


The Computer Society of the IEEE 
grew out of the Computer Group, an 
organization within the 100-year-old 
Institute of Electrical and Electronics 
Engineers that became concerned with 
computing some 36 years ago. The 
Computer Society has since grown into 
the world’s largest association of com¬ 
puting professionals, with a total mem¬ 
bership of approximately 90,000 computer 
scientists, computer engineers, and 
allied professionals. Society members 
are employed in industry, government, 
and academia all over the world. Soci¬ 
ety membership is open to IEEE mem¬ 
bers, associate members, and student 
members and to non-IEEE members 
who qualify for affiliate membership. 
An affiliate member is a person who 
has achieved status in his or her chosen 
field of specialization and whose 
interests focus in the computing field. 

The Computer Society’s many and 
varied programs are all directed toward 
providing services to its members and 
the profession. Every member receives 
Computer, a monthly magazine of 
general interest to computing profes¬ 
sionals which also covers society news 
and events. Five specialized magazines 
and three journals are also available to 
society members as optional subscrip¬ 
tions and to nonmembers, libraries, and 
organizations. Magazines published by 
the Computer Society include IEEE 
Computer Graphics & Applications, a 
monthly; IEEE Micro, IEEE Design & 

Test, and IEEE Software, all bimonthlies; 
and its newest magazine, IEEE Expert, a 
quarterly dedicated to intelligent systems
and their applications. Research-oriented
journals include IEEE Transactions
on Computers and IEEE Transactions
on Software Engineering, both
published monthly, and IEEE Transactions
on Pattern Analysis and Machine
Intelligence, a bimonthly. The Computer
Society Press publishes nonperiodical 
literature, including tutorial texts and 


The Computer Society is a large
and dynamic organization with
many varied activities conducted
around the world. In this month’s


conference records. Last year 82 new 
titles were produced, and the society’s 
catalog now contains almost 700 titles. 

The society’s programs are not 
limited to its publications. The society 
operates an extensive conference pro¬ 
gram. This year, it will sponsor or 
cosponsor more than 100 conferences, 
symposia, and workshops, with atten¬ 
dance ranging from a few dozen to 
hundreds and thousands. A major new 
conference, the Fall Joint Computer 
Conference, premiered last November at 
the Infomart facility in Dallas. Other
major conferences have recently
celebrated anniversaries ranging from
10 to 25 years.

Technical committees offer the 
opportunity to interact with peers in 
technical specialty areas such as soft¬ 
ware engineering, design automation, 
test technology, computer communica¬ 
tions, microprocessors and microcom¬ 
puters, VLSI design, supercomputing 
applications, and personal computing. 
These technical committees sponsor 
workshops and conferences, initiate 
standards development activities, pub¬ 
lish newsletters and bulletins, and 
generally add much to the technical 
vitality of all the society’s programs. 
Membership in a TC is open to both 
society members and qualified non¬ 
members. 

A unique aspect of the society’s 
activities and one that is consistently 
growing in importance is the develop¬ 
ment of standards. Draft standards are 
written by over 60 standards working 
groups in all areas of computer technol¬ 
ogy; after approval via vote, they become 
IEEE standards used throughout the 
industrial world. 

In addition, tutorials, educational 
activities, accreditation of computer 
science and engineering academic pro¬ 
grams, and an international electronic 
mail network all play prominent roles in 
the society’s activities. The Computer 
Society has over 100 local chapters 
throughout the world, and an additional 
100-plus student chapters. 


column, Executive Director T. Michael 
Elliott discusses the society’s origins 
and current activities. 

President Roy L. Russo 


The Computer Society’s success is 
largely due to the considerable efforts of 
many of its members who volunteer 
their time and talents to work on the 
society’s programs. Those volunteers are 
supported by a professional staff of 76 
persons—editors, accountants, regis¬ 
trars, managers, sales staff, secretaries, 
and more. The staff work in the soci¬ 
ety’s headquarters office in Washington, 
DC, in its publications office in Los 
Alamitos, California, and in its new
European office in Brussels, Belgium.
The society’s annual budget is approxi¬ 
mately $15 million, but because of the 
volunteer effort involved, that figure 
substantially understates the magnitude 
of society activities when compared to 
for-profit organizations. 

The programs of the Computer Soci¬ 
ety are not designed for the casual com¬ 
puter user, but are essential to the ability 
of the serious computing professional to 
stay current in this very rapidly chang¬ 
ing technology. It’s an exciting organiza¬ 
tion serving an important function in a 
dynamic field. Its staff is proud to 
devote their professional lives to it, and 
the society should be proud of their 
dedication and professionalism. 


Executive Director T. Michael Elliott 



Workshop Announcement 



INTERNATIONAL WORKSHOP 
on PETRI NETS and 
PERFORMANCE MODELS 
Students $35.00 $35.00
Center; 703 Langdon St. Madison, WI, 53706, USA

HOTEL RESERVATIONS: Send no money at this time
University Inn [ ] Single $39 [ ] Double $47
INFORMATION: For a copy of the complete workshop
of Illinois; Box 4348; Chicago IL 60680 USA

Call for Participants

10th Data Communications Symposium 
Building the Global Network 


Napa Valley, California
5-7 October 1987


Sponsored by:

Association for Computing Machinery
Special Interest Group on Data
Communications (SIGCOMM)

The Computer Society of the IEEE
Technical Committee on
Computer Communications

IEEE Communications Society
Technical Committees on Data
Communication Systems,
Computer Communications

Symposium Overview 

The 10th Data Communications 
Symposium will focus on how to 
extend the current generation of 
data networks into a network of 
global proportion. There are 
many fundamental problems in 
naming, routing, protection, and 
other areas that need new insight 
to make a global network feasi¬ 
ble. The symposium will be held 
as a workshop with the goal of 
bringing together a small group 
of experts in data communica¬ 
tions to discuss these and other 
issues related to building the 
global network: 

Global Network Architecture: What is 
needed in a global network that we do 
not have today? Will the global net¬ 
work be a single monolithic network 
like today’s phone system or will it be 
a collection of loosely coupled 
autonomous networks? Will long-haul 
facilities remain costly and have lower 
bandwidth when compared to local 
networks? Where are administrative 
boundaries necessary and what form 
should they take? 

Integrated End-User Services: What
services will users actually want from
next-generation networks? Who will be
the consumers of these services: scientific
supercomputing, information
publishers, financial services? Is the
integration of voice, data, and video
a pipe dream? How will ISDN and
FDDI interact?

Naming and Directories: What is the
global network’s “phone book” going
to look like? How will a directory service
that effectively scales this problem
to size be designed? Does the whole
name structure (from top to bottom)
have to be designed before work on a
directory is begun?


Routing: Are there stable and robust 
routing algorithms that can deal with 
global networks? In fact, what is the 
role of addressing in routing? How 
dynamic should routing be across 
separately managed regions of the net¬ 
work? Will the global network be con¬ 
nectionless or connection-oriented? 

Protection: Should the network or its 
endpoints provide authentication and 
access control? How will the network 
deal with various security models? 

Who will mediate among the multiple 
authentication domains and security 
perimeters? 

Past Experiences: What has been
learned from past attempts to build
community and enterprise-oriented
networks? How do today’s problems
encountered by networks such as
NSFnet and Physicsnet presage the
problems of tomorrow? How will
researchers and the standards communities
cooperate to build the global
network? Can (and must) everything
be standardized?

The workshop will be organized as a 
series of informal presentations and 
moderated panel discussions. To per¬ 
mit effective interaction, the workshop 
will be limited to approximately fifty 
people. 


Registration Fee: 

Make $250 check payable to 10th Data 
Communications Symposium 

Mail three copies of proposal, registration 
form, and registration fee by June 15, 1987. 


Program Committee 

David Clark, Co-Chair, Massachusetts Institute of Technology
David Oran, Co-Chair, Digital Equipment Corporation
Vinton Cerf, Corporation for National Research Initiatives
David Mills, University of Delaware
Roger Needham, Cambridge University
Stephen Langdon, Amdahl Corporation

Participant Instructions 

To attend, submit a 500 word (two 
page) written summary of your rele¬ 
vant work in this area, identifying the 
contribution that you are prepared to 
make at the workshop. 

Send three copies of the proposal, a 
completed registration form, and a 
(refundable) $250 registration fee by 
June 15, 1987. 

Attendees will be selected by the Program
Committee on the basis of their
written proposal. Notification of acceptance
or rejection will be mailed by
August 1, 1987. Accepted participants
will receive further information on the
workshop and their participation. All
other applicants will have their
registration fee promptly returned.


Send Registration Material To: 

10th Data Communications Symposium 
c/o David Clark 

MIT Laboratory for Computer Science 
545 Technology Square 
Cambridge, MA 02139 
(617)253-6003 


Name ___ 

Address ___ 

Affiliation _ Telephone _ 

There will be no on-site or late registration for this workshop. 


10th Data Communications Symposium Registration Form 
Monte Carlo and
molecular dynamics
methods, combined
with code optimized
for vector and parallel
machines, are powerful
tools for investigating
problems in
physics and
chemistry.


Anders Wallqvist and Bruce J. Berne, Columbia University 
Chani Pangali, Amdahl Corporation 


Now that parallel vector processors
are replacing scalar processors
as a computational tool,
scientists must reexamine the efficiency of 
existing code. The algorithms used and the 
code developed on scalar machines should 
be replaced by codes optimized for vector 
and parallel machines. New codes adapted 
to parallel computing should be written for 
a whole range of problems in chemistry 
and physics. 

Because most scientific application programs
are, and will probably continue
to be, written in Fortran, we will have to
rely on Fortran as a programming language
for supercomputers. Extensions of
Fortran 77, usually referred to as Fortran
8X, are being discussed. These changes
(for example, operations on whole vectors
expressed in one statement) would incorporate
instructions tailored to machines
with parallel architectures. However, we
are still a long way from setting and implementing
a standard for general use. In
fact, the proliferation of different vector
processors and architectures could create
a jungle of new Fortran hybrids,
a highly unsatisfactory situation.
Exploiting Physical Parallelism
Using Supercomputers 


Two Examples from 
Chemical Physics 






Figure 1. A many-particle system can be considered as a set of naturally occurring 
neighborhoods, which can be handled by vector processing. (A true multiprocessing sys¬ 
tem would dedicate one processor per particle.) These recurring neighborhoods can be 
mapped onto many physical problems in liquid-state theory. 


Ideally, we would like to minimize the 
reprogramming effort and rely on com¬ 
pilers to achieve maximum efficiency on 
the supercomputer of choice. However, 
we are far from this ideal state. Not even 
the most sophisticated compiler can do 
what the programmer can with some basic 
insights into the parallelism of a given 
problem. Still, the importance of a good 
compiler should not be ignored. We need 
to have ideas about code structure trans¬ 
lated into efficient machine code with a 
minimum of reprogramming. 

We have used a simple approach to 
achieve an efficient program structure for 
investigating two important problems in 
chemical physics. Only this blending of 
supercomputers and a new approach to 
programming has enabled us to study 
these complex problems of aqueous solu¬ 
tions. Using the software tools of the 
Amdahl 1200 system, we were able to vec¬ 
torize our codes with standard Fortran. 

Parallel nature 
of the problem 

Not all problems or computer codes 
lend themselves to parallel computing. 
This limitation of vector processing tends 


to be overlooked when dealing in a general 
program environment. Most application 
computing is done with a wide variety of 
programs of the most diverse nature, the 
majority of which will probably not be eas¬ 
ily vectorized. If this is true, no amount of 
planning or reprogramming will circum¬ 
vent the necessity of doing a large portion 
of the processing at scalar speeds. Having 
a well-balanced machine with both a fast 
vector processor and a fast scalar unit is 
still of importance. 

To determine the applicability of vector 
processing to a given computer code, one 
needs to look at the global aspects of the 
problem. The more physically parallel a 
problem is, the more potential it has for 
successful solution on a vector processor. 
One identifies the tasks that can be done
simultaneously and the ones that may
present scalar bottlenecks and gets a
qualitative feel for the nature of the problem.
Then one determines whether the
problem can be restructured to look like a 
parallel problem. 

Looking at a sample of a liquid, one 
could take the extreme view of having each 
constituent molecule managed by a dedi¬ 
cated processor. This would lead to con¬ 
sideration of very distributive multi¬ 
processor systems. A coarser-grain view, 


in terms of small neighborhoods as shown 
in Figure 1, lends itself more readily to 
single-processor vector computation. A 
neighborhood can be thought of loosely as 
a set of computations needed locally but 
replicated throughout the system. Prob¬ 
lems that break this symmetry will be 
harder to process efficiently on a parallel 
computer. 

A neighborhood might not always be 
thought of as a physical entity; it could 
also be a conceptual idea of quantities that 
vary slowly over the calculation. In that 
case, it might be advantageous to break 
these slowly varying quantities out of the 
calculation and recalculate only the rapidly 
varying quantities frequently. 

Once the general nature of parallelism 
has been determined, one can start work¬ 
ing on the computer to implement lower 
level parallelism into the code. Here, we 
present two problems in liquid-state the¬ 
ory using two common techniques, Monte 
Carlo and molecular dynamics, and cover 
the general structure of the vectorization 
process. (The accompanying sidebars pro¬ 
vide slightly more detailed explanations.) 

The solvated electron 

The structure of the solvated electron is 
of considerable theoretical and experimen¬ 
tal interest. 1,2 An excess electron in a fluid 
may be free, quasifree, weakly localized, 
or strongly localized depending on the 
nature of the solvent, for example, its den¬ 
sity and temperature. The nature of elec¬ 
tron states in a wide variety of media has 
been investigated both experimentally and 
theoretically. Many of the models that 
have been used to predict the structural 
and dynamic properties of the electron in 
polar solvents have been purely 
phenomenological in nature. Ignorance of 
the effect of short- and long-range forces 
felt by an electron in a fluid environment 
has forced previous investigators to use 
relatively crude models. 

Our calculations, based on detailed 
model potentials and path-integral Monte 
Carlo methods, investigate the nature of 
the solvated electron. These methods allow 
us to statistically sample, from a model 
system, real-world phenomena. Statistical 
mechanics assures us that microscopic 
properties calculated from a model system 
have a macroscopic thermodynamic 
meaning. The computer experiments will 
allow us to examine in detail the 
equilibrium properties of the system in 
terms of structure and energetics. By using 
Monte Carlo
techniques 
in the study 
of liquids 



The facade of the famous Le Casino in Monte Carlo, Monaco. This small country hosts 
one of the world’s biggest gambling resorts. (Photo: B. Van Berg © Image Bank West) 


The goal in the study of liquids is 
to understand their detailed behavior 
at the molecular level. We do this via 
a model system that generates all 
relevant positions of the atoms and 
molecules in the liquid system. This 
lets us view the local structure 
around single molecules, see how 
different molecules arrange them¬ 
selves around a solute, and look at 
properties not readily available in 
wet experiments. Of course, the 
model system must agree with any 
macroscopic experimentally deter¬ 
mined property. 

The term Monte Carlo techniques 1
referred originally to the numerical
evaluation of multidimensional 
integrals too hard to do analytically. 
In statistical mechanics, 2,3 Monte 
Carlo methods are used to calculate 
averages of molecular properties 
expressed as multidimensional 
integrals. 

The approach in calculating ther¬ 
modynamic properties of a liquid is 
to assume a certain interaction 
between atoms — for example, two 


positive ions will repel each other 
according to a coulombic potential. 
At the level of approximation in 
which we are interested, all quantum 
effects are ignored. This is a good 
assumption at most temperatures 
where the atoms are so far apart on 
a molecular scale that they will not 
be able to feel each other’s core elec¬ 
trons. Monte Carlo techniques are 
then used to generate different posi¬ 
tions of the model system such that 
they contribute significantly to the 
average. 

Any physical assembly of 
molecules and atoms can be studied 
with this technique. Its use has led to 
greater understanding of such 
phenomena as solvation properties 
of molecules in different solvents, 
the hydrophobic effect (the apparent 
attraction between non-interacting 
molecules in a water solution), and 
the dynamics of larger aggregates 
such as membranes and proteins in a 
water solution. 

Other problems that can be 
mapped onto this kind of an 


approach can also be studied fruit¬ 
fully using Monte Carlo methods. 
An interesting application is place¬ 
ment of VLSI circuit components 
using a Monte Carlo method. 4 
simulations we will have a complete record
of generated molecular configurations 
from which to investigate physical and 
thermodynamical quantities. 

A parallel approach. To exploit the vec¬ 
tor nature of the problem, it is necessary to 
first consider the global aspect of the simu¬ 
lation. In an equilibrated liquid sample, the 
structure of the nearest neighbors of each 
molecule will not change drastically over 
a given small set of configurations gener¬ 
ated by the Monte Carlo walk. Then we 
will have clearly defined subregions of 
space where the number of molecules stays 
the same. Thus, in the physical nature of 
the problem, we already have the founda¬ 
tion for creating a highly vectorized 
program. 

The slow motion of the molecules per¬ 
mits setting up vectors tabulating all 
molecular interactions of each subregion. 3 
These vectors are periodically updated to 
govern exactly which interactions are to be 
included in each calculation needed to 


propagate the system. To be sure to include 
all relevant interactions, one creates a 
sphere of influence, tabulating a few addi¬ 
tional interactions of particles that might 
wander into the subregions between 
updates, and later suitably corrects for this 
in the actual evaluation of the energy. 
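
The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' Fortran: the names `build_book`, `skin`, and `pair_energy_sum` are hypothetical, the positions are plain coordinate tuples, and periodic boundaries are ignored. The list is built with a radius slightly larger than the cutoff (the sphere of influence), and the true cutoff is applied later, when the energy is evaluated.

```python
import math

def build_book(positions, r_cut, skin):
    """Return book[i]: indices j inside the sphere of influence of molecule i.

    `skin` (a hypothetical name) is the extra margin beyond the cutoff that
    keeps the list valid while molecules drift between updates.
    """
    r_list_sq = (r_cut + skin) ** 2
    book = [[] for _ in positions]
    for i, pi in enumerate(positions):
        for j, pj in enumerate(positions):
            if i != j and sum((a - b) ** 2 for a, b in zip(pi, pj)) < r_list_sq:
                book[i].append(j)
    return book

def pair_energy_sum(i, positions, book, r_cut, pair_potential):
    """Energy of molecule i over its tabulated neighbors.

    Listed pairs that lie beyond the true cutoff are skipped here; this is
    the correction applied at evaluation time.
    """
    total = 0.0
    for j in book[i]:
        r = math.dist(positions[i], positions[j])
        if r < r_cut:
            total += pair_potential(r)
    return total
```

A larger `skin` means fewer list rebuilds but more tabulated pairs that fail the cutoff test, which is the trade-off the text describes.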

The next level of parallelism to investigate
is within the actual Monte Carlo pass
where, in fact, the algorithm does not
exhibit an inherent vector character. Consider
a “typical” pass where N is the number
of water molecules in the system:

    do i = 1, N
        make a trial move of the ith water
        evaluate the energy difference, ΔE, between configurations
        if e^(-ΔE/kT) > random number between 0 and 1 then
            accept new configuration
        else
            reject new configuration
        endif
    enddo
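
As a rough illustration of why this loop resists vectorization over i, here is a minimal Python sketch of one pass. It is not the authors' code: `metropolis_pass`, `energy_change`, and the one-dimensional toy energy `harmonic_change` are assumptions made for the example. The point to notice is that each trial move reads the positions left behind by all earlier trials in the same pass.

```python
import math
import random

def metropolis_pass(positions, energy_change, kT, max_step, rng):
    """Attempt one trial move per particle, in order; return the number accepted.

    Each trial sees the configuration produced by the earlier trials, which
    is the loop-carried dependence that prevents vectorizing over i.
    """
    accepted = 0
    for i in range(len(positions)):
        new = positions[i] + rng.uniform(-max_step, max_step)
        delta_e = energy_change(positions, i, new)
        # Downhill moves always pass; uphill moves pass with
        # probability exp(-delta_e / kT).
        if delta_e <= 0.0 or rng.random() < math.exp(-delta_e / kT):
            positions[i] = new
            accepted += 1
        # Otherwise the old position is kept (move rejected).
    return accepted

def harmonic_change(positions, i, new):
    """Toy energy change for independent particles with E = sum(x**2) / 2."""
    return 0.5 * (new * new - positions[i] * positions[i])
```

In a realistic simulation the cost sits inside `energy_change`, which is where the restructuring effort described in the text is directed.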


Since each loop iteration depends on all the
previous configurations, a serious recurrence
problem arises, which is not encountered
in molecular dynamics (see next section).
Thus, for more complicated systems than 
Lennard-Jones type particles, the main 
effort of the code restructuring should be 
within the energy calculation and in the 
maintenance of the bookkeeping vectors 
governing which interactions should be 
included. The bulk of the CPU time is 
spent on the calculation of the water-water 
interactions because, after each move of 
the water molecule, the pair energy of all 
surrounding molecules changes. The 
global parallelism is used here to precollect 
all interacting pairs and pack them into 
consecutively filled vectors for processing 
when calculating the interaction energies 
(see sidebar on facing page). 

The code was developed using software
tools available for the Amdahl 1200. 4
Their main advantage lies in the interactive
vectorizer, which provides immediate
feedback on the vectorization effort. Since no
special machine calls were employed, we
used standard Fortran 77, thus permitting
easy portability of the code. All
gather/scatter type operations were handled
automatically by the compiler. Very
few compiler directives were used, and
those only to force vectorization of certain
loops that contained recurrence relations
the compiler could not resolve. To avoid
introducing non-Fortran statements in the
actual code, these directives are written
as comment lines; a standard compiler
ignores them, but the vector compiler
still recognizes them.

Figure 2. Code comparison by tasks shows a 20-times performance speedup achieved by going from scalar to vector processing. The
bars labeled water 1 and 2 refer to Monte Carlo moves involving only the water molecules, and the electron 1 and 2 bars refer to the
electrons. Note that relatively little time is being spent in controlling the indices, gathering vectors, etc., as opposed to doing the actual
Monte Carlo walk with energy evaluations.

A performance evaluation of the code in 
scalar mode shows that most of the time is 
spent calculating water-water energies after 
each trial move in a pass (see Figure 2). 
Successful vectorization substantially 
reduces the overall time and redistributes 
the time spent in each task more evenly. 

Thus, without any special knowledge of 


the Amdahl 1200’s architecture and with a 
minimum amount of effort, we effectively 
used the machine’s vectorization feature. 

For a typical set of parameters, we 
achieved a megaflop rate of 150 out of a 
possible 530. This result, we think, is 
encouraging for our fairly complex 
6000-line code. The relative speeds 
achieved by some other computers on 
which we’ve tried this code are shown in 
Figure 3. 


Design of a vectorized loop in a typical MC pass 


The most time-consuming part of the water/electron 
program is the constant reevaluation of interaction ener¬ 
gies. Given below are some of the practical strategies we 
employed, using only Fortran 77, to vectorize parts of our 
code. Efficient scalar optimization was achieved before 
any restructuring of the code. Speedup in scalar mode was 
only minimal. 

It is assumed that a sphere of influence, tabulating all 
molecular interactions and giving the potential cutoff, has 
been constructed around each particular molecule. The 
main objective of the vectorization was to achieve large 
vector lengths that would efficiently utilize the Amdahl 
1200’s vector unit. A shorthand notation is used in giving 
the general outline of the Fortran code. 

Scalar version of water-water interactions. The loop to calculate
the interactions between water molecules is typically
the most time-consuming part of the program. In the trial
move of molecule i, we need to calculate the potential
energy of all molecules interacting with this molecule. The
indices j of these interacting molecules, forming the set
book(i) of those j's interacting with a given i, are calculated
and stored in a vector at frequent intervals. Because the
system is not static, one includes extra molecules outside
the cutoff radius R_cut to ensure examination of all possible
interacting pairs in the energy evaluation. Thus, an If statement,
correcting for those extra molecules originally in the
sphere of influence but not within the cutoff radius, must
be incorporated in the loop. The probability of this If
statement being true (that is, that this molecular pair is
within the cutoff radius and should be included in
the energy calculation) is set to be larger than 90 percent
by regulating how many extra molecules are included. This
If statement breaks a large part of the program out of the
book(i) loop. The scalar optimization in this case rests
mainly on evaluating V(|r|) most efficiently, usually using
some lookup scheme in r^2 space. Schematically this can
be written as

    do loop over number of interacting pairs, i,j for given i where j ∈ book(i)
        find distance, d^2_CM
        if (d^2_CM < R^2_cut) then
            do loop over all atoms on ith water
                do loop over all atoms on jth water, j ∈ book(i)
                    find V(|r_ij|)
                enddo
            enddo
        endif
        sum up total new and old energy
    enddo

In this case, the only statement that vectorizes is the
potential energy evaluation, the innermost find V(|r_ij|) step.
However, the vector lengths involved over the atoms of the
water molecule pair are so short that evaluating V(|r|) in
scalar mode might be more efficient.

Vector version of water-water interactions. In this case,
we are trying to increase the vector lengths in the potential
energy evaluation by pulling it outside the If statement. All
statements, including the If statement, can then be moved
inside the double loop, which then vectorizes. The CPU time
lost in calculating 10 percent more interactions
is more than compensated by the vectorization of the
statements, which yields a speedup of approximately 20. The following
outline makes this clearer. Each statement inside the double loop
is an operation on a vector of dimension l, where l is the total
number of interacting pairs.

    find cutoff distance, d^2_CM
    do loop over all atoms on ith water
        do loop over all atoms on jth water, j ∈ book(i)
            find r^2_ij
            find V_ij(r^2_ij)
            if (d^2_CM < R^2_cut) then sum V_ij
        enddo
    enddo
    sum up total new and old energy
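
The restructuring described above can be imitated in Python to check that both loop orders give the same energy. This is a sketch under stated assumptions, not the authors' Fortran: the function names and the data layout (one list of squared atom-atom distances per molecule pair) are hypothetical, and plain Python lists stand in for vector registers, so the example shows the loop structure rather than any speedup.

```python
def energy_scalar(pairs_cm_sq, atom_r_sq, r_cut_sq, potential):
    """Test the cutoff per molecule pair, then loop over its short atom list."""
    total = 0.0
    for d_sq, r_sq_list in zip(pairs_cm_sq, atom_r_sq):
        if d_sq < r_cut_sq:
            for r_sq in r_sq_list:  # short inner loop: few atom pairs
                total += potential(r_sq)
    return total

def energy_packed(pairs_cm_sq, atom_r_sq, r_cut_sq, potential):
    """Loop over atom-pair slots outermost; sweep all molecule pairs inside.

    The cutoff test becomes a 0/1 mask, so every sweep is long and
    branch-free, at the price of evaluating the potential for the few
    pairs that lie outside the cutoff.
    """
    n_slots = len(atom_r_sq[0])
    mask = [1.0 if d_sq < r_cut_sq else 0.0 for d_sq in pairs_cm_sq]
    total = 0.0
    for slot in range(n_slots):
        v = [potential(r_sq_list[slot]) for r_sq_list in atom_r_sq]
        total += sum(m * vi for m, vi in zip(mask, v))
    return total
```

The masked version does strictly more arithmetic; it only pays off when the hardware processes each long sweep much faster than the short, branching inner loop.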
Figure 3. Relative speedup for some of the computers on which we have tried this Monte Carlo code. Not every machine is necessarily
running the best-optimized version of our program for this application. In Monte Carlo simulations it is often not necessary to use
64-bit precision in the calculations. The scale is such that the VAX 11/780 performance corresponds to 1 unit.



Figure 4. A view of the six nearest water molecules to the electron with the pair correlation function from the electron center of mass 
to the electron polymer beads and the oxygen atoms on the water molecule. The oxygen atoms are depicted in red, hydrogen in white, 
and the electron chain is drawn as a string of green beads. The cavity is seen not to have a strong coordination of the surrounding 
water molecules. The waters closest to the electron have their hydrogen bonding reduced. 
Simulation. The simulations were carried
out over a whole range of physical and 
simulation parameters, such as variations 
in temperature and system size. Approxi¬ 
mately 10K passes, equaling approximately 
7.5 million different configurations, were 
run for each simulation after equilibration. 
Configurations were dumped to mass stor¬ 


age at set intervals for later analysis. The 
total simulation took approximately 12 
CPU hours. 

After extensive equilibration the electron 
clearly digs itself a cavity in the fluid. Fig¬ 
ure 4 shows a snapshot from the simula¬ 
tion, including only the six nearest water 
molecules and the surrounding water oxy¬ 


gens. (For a more detailed description of 
the results, see Wallqvist. 5 ) 

Pure water droplets 

Despite its biological importance in 
transport processes, the interface between 


Computational methods 

The Monte Carlo (MC) and molecular dynamics (MD) 
methods present powerful tools for investigating a whole 
range of problems in chemistry and physics: solvation theory, 
biomolecular systems, reaction rate theory, quantum field
theory, etc. These techniques owe their rapid growth and
widespread applicability to the introduction of computers in 
the late fifties and early sixties. Today they are the instru¬ 
ments of almost every theoretical physical scientist. 

Monte Carlo. The MC method is basically an integration 
method for evaluating tough integrals. The integral we want to 
do is a multidimensional integral over all possible degrees of 
freedom of a small sample of water molecules. As the degrees 
of freedom are typically in excess of 2000, any analytical 
method is out of the question. 

In a typical implementation the MC method consists of an
algorithm for generating a random walk in configuration
space such that the set of configurations generated is distributed
according to the equilibrium Boltzmann distribution
function p(x). 1 The equilibrium average of a function A can
be expressed in terms of p(x),

$$\langle A \rangle = \int dx\, p(x) A(x).$$

Thus, if we sample N x's, or realizations of the system, with
probability p(x), to form the sum,

$$A_N = \frac{1}{N} \sum_{i=1}^{N} A(x_i),$$

then,

$$\lim_{N \to \infty} A_N = \langle A \rangle.$$

A simple way to generate a random walk with limiting probability
p(x) was given by Metropolis et al. 2 in 1953 using the
properties of Markov processes. In this scheme the probability
for accepting a new configuration in the walk, or generation
of configurations, is given by

$$W_{x_i \to x_{i+1}} = \min\left(1, e^{-\Delta E / kT}\right),$$

where ΔE is the change in potential energy for going from
configuration x_i to x_{i+1}, k is Boltzmann’s constant, and T is


the temperature of the system. An algorithm that does this is 
then given by the following steps: 

(1) Select a particle to be moved, x_i → x_{i+1}.
(2) Compute the energy difference, ΔE.
(3) Calculate the acceptance probability, W_{x_i → x_{i+1}}.
(4) Generate a random number, ξ, between zero and unity.
(5) If ξ < W, accept the move; otherwise reject it.
(6) Calculate and accumulate any physical property.

The system is propagated by changing the positions of the
atoms or molecules one at a time. One cycle through all
particles is termed a “pass.” The moves or step sizes are usually
adjusted to give an acceptance probability of about 30 percent
to ensure rapid convergence of the walk. A multiparticle step
with a fixed step size would theoretically be possible but
would have such a low probability of acceptance that it is not
practically feasible. A multiparticle step with a small step size
would get accepted often but would not move enough to
assure convergence.
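The six steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' program: the toy potential (independent harmonic wells, V = x²/2 per particle), the uniform trial move, and the names `metropolis_pass`, `harmonic_energy_change`, and `run` are all hypothetical choices made for the example.

```python
import math
import random

def metropolis_pass(x, energy_change, step, kT, rng):
    """One pass: attempt to move every particle once; return the number accepted."""
    accepted = 0
    for i in range(len(x)):
        trial = x[i] + rng.uniform(-step, step)        # (1) trial move of particle i
        dE = energy_change(x, i, trial)                # (2) energy difference
        W = 1.0 if dE <= 0.0 else math.exp(-dE / kT)   # (3) acceptance probability
        xi = rng.random()                              # (4) random number in [0, 1)
        if xi < W:                                     # (5) accept, else keep old position
            x[i] = trial
            accepted += 1
    return accepted

def harmonic_energy_change(x, i, trial):
    """Assumed toy potential: V = sum_i x_i^2 / 2 (not a water model)."""
    return 0.5 * (trial * trial - x[i] * x[i])

def run(n_particles=50, n_passes=4000, kT=1.0, step=2.0, seed=1):
    """Return the estimate of <x^2> and the overall acceptance fraction."""
    rng = random.Random(seed)
    x = [0.0] * n_particles
    total, count, accepted = 0.0, 0, 0
    for p in range(n_passes):
        accepted += metropolis_pass(x, harmonic_energy_change, step, kT, rng)
        if p >= n_passes // 10:                        # skip the first passes (equilibration)
            total += sum(v * v for v in x) / n_particles   # (6) accumulate a property
            count += 1
    return total / count, accepted / (n_passes * n_particles)
```

For this toy potential the exact equilibrium value of the average of x² is kT, so `run()` should return a first value close to 1.0; in practice `step` would be tuned until the second value is near the acceptance fraction recommended in the text.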

Looking at the diffusion in the system or at the correlation
of subsequent configurations provides some guidance on how
efficiently the algorithm is sampling the space of all possible
configurations. Looking for ways to better sample
configuration space is an important part of methodological research,
and this search has led to improved algorithms like umbrella
sampling, 2 force-bias Monte Carlo, 3 etc. Thus, with
knowledge of the interaction potentials between the particles, one
can calculate equilibrium thermodynamic quantities of the
system.

Molecular dynamics. Molecular dynamics is another
integration technique used to evaluate averages of physical
quantities. 4 It is based on the assumption that a time average of A is
equal to an ensemble average of A, $\langle A \rangle$,

\[ \lim_{T \to \infty} \frac{1}{T} \int_0^T dt\, A(t) = \langle A \rangle = \mathrm{trace}\{ e^{-\beta H} A \} / Q, \]

where $\beta = 1/kT$ and H is the Hamiltonian of the system containing the
kinetic and potential energy for N particles of mass m and
positions $q^N$, interacting through the pairwise additive
potential V,

\[ H = \frac{1}{2} \sum_i m_i v_i^2 + \sum_{i<j} V_{ij} = T + V(q^N), \]

and Q is the classical partition function for the system,
water and other media is relatively
unknown. We are interested in the structural
properties of the simplest interface,
that of a vacuum, as a model and stepping
stone for investigating more complicated
systems.

Our previous investigations 6 have
focused on liquid water systems and
smaller clusters ($N \lesssim 10$) of water.
Currently, we are interested in the properties of
small droplets (N < 1000) of water in a
vacuum. Our interest in these systems lies
primarily in the surface properties and
behavior of water molecules at the
interface. Dielectric behavior and infrared
spectra are also readily calculated and analyzed
from the generated configurations. Due to
the nature of the system, room-temperature
water molecules, the system
can be treated classically. 6 In this case,
where we are interested in time-dependent
phenomena, we use the molecular
dynamics technique to calculate the
thermodynamic properties of our system.


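\[ Q(\beta, V) = \int dx_1 \int \mathcal{D}x(\tau)\, e^{-S[x(\tau)]} \]

\[ S[x(\tau)] = \frac{1}{\hbar} \int_0^{\beta\hbar} d\tau\, H(x(\tau)) \]


\[ Q = \int e^{-\beta H}\, dq^N. \]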

The underlying assumption here is the ergodic hypothesis,
which states that M observations made on a single system at
uncorrelated times have the same distribution as M
observations made (at the same time) on M independent ensembles in
the limit $M \to \infty$. Then, the average of any quantity A can be
written as

\[ \langle A \rangle = \frac{1}{M} \sum_{i=1}^{M} A(q(t_i), p(t_i)), \]

where the positions q and velocities v (momenta p = mv), as functions
of time t, are given by

\[ q(t) = q_0 + v_0 t + a_0 \frac{t^2}{2} + \cdots, \]

\[ v(t) = v_0 + a_0 t + \cdots, \]

where $q_0$ and $v_0$ denote initial positions and velocities and $a_0$ is
the initial acceleration. These equations are then propagated
by an integrator. Often a crude integrator of first order is
quite enough for liquid state simulations where cancellations
of errors are to be expected. The simulation is performed by
initially assigning momenta to each particle from a
Boltzmann distribution, constrained to give the desired temperature
T by

\[ \frac{1}{2} \sum_i m_i v_i^2 = \frac{3NkT}{2}, \]

and propagating the system in time by solving Newton’s
equations of motion

\[ F_i = m_i a_i, \quad i = 1, \ldots, N \]

for each of the atoms in the system. The forces between
particles i and j are given by the negative gradient of the interaction
potential

\[ F_{ij} = -\nabla V_{ij}. \]

By choosing a time-independent Hamiltonian, dH/dt = 0, 
the trajectories have constant energy. This is a good check of 
the coded algorithm. The time step is chosen to ensure that 
the total energy does not vary by more than a given tolerance, 
usually on the order of a few tenths of a percent of the kinetic 
energy. 
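The initialization and propagation just described can be sketched as follows. This is a hedged illustration, not the authors' code: it uses a one-dimensional toy system of independent harmonic oscillators (so the kinetic-energy constraint becomes NkT/2 instead of 3NkT/2), the propagation step is the truncated Taylor expansion quoted in the text, and the names `init_velocities`, `taylor_step`, and `relative_energy_drift` are hypothetical. The drift is measured relative to the total energy, which is a simplification of the tolerance quoted above.

```python
import math
import random

def init_velocities(n, mass, kT, rng):
    """Draw velocities from a Gaussian, then rescale to hit the target temperature."""
    v = [rng.gauss(0.0, math.sqrt(kT / mass)) for _ in range(n)]
    kinetic = 0.5 * mass * sum(vi * vi for vi in v)
    target = 0.5 * n * kT                 # 1-D analogue of (3/2) N k T
    scale = math.sqrt(target / kinetic)
    return [vi * scale for vi in v]

def forces(q, k=1.0):
    """Assumed toy force: F = -dV/dq with V = k q^2 / 2."""
    return [-k * qi for qi in q]

def total_energy(q, v, mass, k=1.0):
    return sum(0.5 * mass * vi * vi + 0.5 * k * qi * qi for qi, vi in zip(q, v))

def taylor_step(q, v, mass, dt):
    """q(t+dt) = q + v dt + a dt^2/2 and v(t+dt) = v + a dt, as in the text."""
    a = [f / mass for f in forces(q)]
    q_new = [qi + vi * dt + 0.5 * ai * dt * dt for qi, vi, ai in zip(q, v, a)]
    v_new = [vi + ai * dt for vi, ai in zip(v, a)]
    return q_new, v_new

def relative_energy_drift(n, dt, n_steps, mass=1.0, kT=1.0, seed=0):
    """Propagate and report |E_final - E_initial| / E_initial."""
    rng = random.Random(seed)
    q = [0.0] * n
    v = init_velocities(n, mass, kT, rng)
    e0 = total_energy(q, v, mass)
    for _ in range(n_steps):
        q, v = taylor_step(q, v, mass, dt)
    return abs(total_energy(q, v, mass) - e0) / e0
```

Calling `relative_energy_drift` with the same total simulated time but different values of `dt` shows how the time step controls the energy error, which is the check on the coded algorithm that the text recommends.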

Path integral methods. Feynman’s path integral
formulation of quantum statistical mechanics 5 makes it possible to
simulate quantum many-body systems of physical interest.
According to this formulation, the partition function $Q(\beta, V)$ is
an integral over paths $x(\tau)$, each weighted by $e^{-S[x(\tau)]}$, where
$S[x(\tau)]$ is the action corresponding to the path $x(\tau)$ in imaginary
time $\tau$; $H(x(\tau))$ is the path dependence of the Hamiltonian; and
$\int \mathcal{D}x(\tau)(\ldots)$ represents an integration over all paths starting at
$x(0) = x_1$ and ending at $x(\beta\hbar) = x_1$. In the discretized path
representation the path in imaginary time is approximated by
straight-line segments between neighboring imaginary times. This allows
$Q(\beta, V)$ for a one-dimensional particle moving in an external
potential V to be expressed as

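\[ Q_P = \left( \frac{mP}{2\pi\beta\hbar^2} \right)^{P/2} \int dx_1 \cdots dx_P\, e^{-\beta \Phi_P(x_1, \ldots, x_P)}, \]

\[ \Phi_P = \frac{mP}{2\beta^2\hbar^2} \sum_{t=1}^{P} (x_t - x_{t+1})^2 + \frac{1}{P} \sum_{t=1}^{P} V(x_t). \]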

Since $Q_P$ is equivalent to the classical configurational
partition function of P classical particles with potential $\Phi_P$, the
quantum system is said to be isomorphic to a classical P-particle
cyclic chain polymer in which each particle t interacts with
its neighbors t − 1 and t + 1 through a harmonic potential with
force constant $mP/\beta^2\hbar^2$, and each particle experiences the
reduced potential V/P (see top figure on facing page). Clearly,
$Q_P$ is an approximation to the true Q. It is easy to prove that
$Q_P \geq Q$ and $Q = \lim_{P \to \infty} Q_P$. In path-integral simulations, one
empirically determines and uses that P beyond which there is
no detectable change in thermodynamic properties.
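To make the isomorphism concrete, the effective potential of the cyclic chain can be evaluated directly from the two ingredients named in the text: harmonic springs of force constant mP/β²ħ² between neighboring beads and the reduced potential V/P on each bead. The sketch below assumes a one-dimensional particle and units in which ħ = 1 by default; the function name `ring_polymer_potential` is hypothetical.

```python
def ring_polymer_potential(x, V, mass, beta, hbar=1.0):
    """Effective potential of a cyclic chain of P = len(x) beads.

    x    : list of bead positions x_1 .. x_P (bead P+1 is bead 1 again)
    V    : callable giving the external potential at a position
    """
    P = len(x)
    k_spring = mass * P / (beta ** 2 * hbar ** 2)      # force constant from the text
    spring = 0.5 * k_spring * sum(
        (x[t] - x[(t + 1) % P]) ** 2 for t in range(P)  # cyclic: last bead bonds to first
    )
    external = sum(V(xt) for xt in x) / P               # each bead feels V/P
    return spring + external
```

When all beads sit at the same point the springs contribute nothing and the function reduces to the classical potential V at that point, which is one way to see why P = 1 recovers the classical limit.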

The isomorphic classical system can be simulated using 
either Monte Carlo or molecular dynamics techniques. 6 This 
is the basis for the rapid growth of path-integral simulations 
of quantum and mixed classical and quantum systems in 
recent years. Detailed information about systems where
quantum effects are important, such as tunneling phenomena, is now
becoming available.

Simulations. The simulation of a physical system is carried 
out on the computer using either MC or MD methods to sam- 


where $x_t = x(t\beta\hbar/P)$, $x_{P+1} = x_1$, and
Molecular dynamics is a technique
similar to Monte Carlo in that it allows us to
calculate macroscopic quantities from a
small, statistically sampled microscopic
system. The actual simulation is done by
assigning positions and velocities to a small
set of molecules and then propagating
them in time using Newton’s equations.


The only knowledge required for this
approach is the potential energy functions
of the atoms and molecules in the system.
(A more in-depth description of molecular
dynamics and simulations in general is
given in the special section, “Computational
Methods,” below and on the preceding
pages.)


A parallel approach. This example 
exhibits the same type of global parallelism 
as the preceding path-integral Monte Carlo 
simulation — that is, the system behaves 
conservatively and is restricted to a fairly 
well confined region of space. Thus, we can 
think of the problem in terms of subregions 
and predict that it will be eminently vec- 


[Figure: cyclic chain polymer, P = 6.] An isomorphic polymer with P
beads interacting through nearest-neighbor harmonic forces. Each bead
feels an external potential V(r)/P. A quantum particle with
discretization of P = 6 interacting with two classical particles is
shown.

[Figure: particles A and B and the periodic image A'.] Periodic
boundary conditions are imposed to minimize size effects. A particle A
interacts with particle B via its minimum-distance image A'. This
necessitates an interaction cutoff smaller than or equal to L/2, half
the box length.


pie an equilibrium distribution of N (usually several hundred) 
interacting particles in a box of length L. Ideally, one would
like to simulate real systems with macroscopic quantities of
material, that is, $O(10^{23})$ molecules. However, because
thermodynamic properties can usually be calculated with far
fewer particles and because molecular interactions are usually
short-ranged, there is still room for computer simulations.
Periodic boundary conditions (see figure above) are imposed
so as to minimize the size effect of the small number of
molecules involved. The particles interact with each other via
a continuous two-body interaction potential V cut off at some
distance less than or equal to half the box length. The potential
originates either from a fit made to certain characteristics
of the system, such as the second virial coefficient, or
from an ab initio calculation of the two-body interaction
energy. A form is chosen for the potential, usually incorporating
coulombic forces, polarization forces, short-range repulsions,
etc.; that is, one tries to make the potential as realistic
as possible while maintaining a simple functional form.
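The minimum-image rule behind the periodic boundary conditions can be written in a few lines. This is a minimal sketch assuming a cubic box of side L; the helper names `minimum_image` and `pair_distance` are hypothetical.

```python
import math

def minimum_image(dx, L):
    """Map a coordinate difference onto its nearest periodic image, within [-L/2, L/2]."""
    return dx - L * round(dx / L)

def pair_distance(a, b, L):
    """Distance between particle a and the closest periodic image of particle b."""
    d = [minimum_image(bi - ai, L) for ai, bi in zip(a, b)]
    return math.sqrt(sum(c * c for c in d))
```

Because each component of the separation is folded into the interval from -L/2 to L/2, a spherical cutoff no larger than L/2 guarantees that a particle never interacts with more than one image of the same neighbor, which is the restriction stated in the text.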

A system that has been extensively investigated by both MC
and MD simulations is the Lennard-Jones system.
Lennard-Jones particles are characterized by a weak attractive van der
Waals force and a short-range hard-core repulsion. The
potential is given by the form

\[ V(r) = 4\epsilon \left[ \left( \frac{\sigma}{r} \right)^{12} - \left( \frac{\sigma}{r} \right)^{6} \right], \]

where $\epsilon$ is the strength of the interaction, $\sigma$ gives the size of the
atom, and r is the radial distance between the two interacting
particles. It has been found that a whole range of force fields
in biomolecular modeling 7 can be parameterized using this
form for the potential.
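As a small worked example, the Lennard-Jones form and the radial force derived from it can be coded directly. The function names and the default reduced units (ε = σ = 1) are assumptions made for the illustration.

```python
def lj_potential(r, epsilon=1.0, sigma=1.0):
    """V(r) = 4 eps [(sigma/r)^12 - (sigma/r)^6]."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def lj_force(r, epsilon=1.0, sigma=1.0):
    """Radial force -dV/dr; positive values are repulsive."""
    sr6 = (sigma / r) ** 6
    return 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r
```

The potential crosses zero at r = σ and has its minimum, of depth ε, at r = 2^(1/6) σ, where the force vanishes.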
torizable. The difference from the solvated
electron problem lies in the algorithm used
for generating configurations. In this case
there is also an inherent parallelism in the
actual time step loop. Consider the loop
over all interacting pairs ($N_{\mathrm{H_2O}}$ is the number
of water molecules in the system):

do i = 1, N_H2O - 1
   do j = i + 1, N_H2O
      find energy (optional)
      find force between pair
   enddo
enddo

Unlike the Monte Carlo algorithm, the
double loop over all pairs in this system
does not depend on any previous quantity
in the loop. Any control statement specifying
which pairs to include in the force
calculation can be moved outside the double
loop. Then this loop will contain only the
time-consuming task of calculating the
forces. These are most conveniently found
using some type of table lookup scheme. In
this example, most of the human effort was
spent in recoding the section of the
program containing this double loop.

It is desirable to remove the double loop
structure altogether, as the innermost loop
has a variable vector length associated with
it, from 1 to $N_{\mathrm{H_2O}}$, making it hard to use
the vector pipelines efficiently at the lower
vector dimensions. To circumvent this
problem, we can gather all the proper
interacting pairs (given the actual cutoff used
in the simulation) into one long vector of
length $N_{\mathrm{H_2O}}(N'_{\mathrm{H_2O}} - 1)/2$, where $N_{\mathrm{H_2O}}$
is the total number of water molecules and
$N'_{\mathrm{H_2O}}$ is the average number of interacting
pairs for each molecule. With this scheme
one can achieve both a single loop and an
increased vector length to be processed in
the loop. Thus, the entire force loop can be
vectorized with maximum efficiency (see
box below).
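The gathering step can be illustrated with a short Python sketch that turns the double loop into two flat index lists, one entry per interacting pair. It assumes a cluster in vacuum (no periodic images) and point-like molecules; the function name `build_pair_list` and the `margin` parameter, which admits pairs slightly outside the cutoff so that the list need not be rebuilt every step, are hypothetical.

```python
from itertools import combinations

def build_pair_list(positions, r_cut, margin=0.0):
    """Return parallel lists (i_idx, j_idx) of all pairs closer than r_cut + margin."""
    r2_max = (r_cut + margin) ** 2
    i_idx, j_idx = [], []
    for i, j in combinations(range(len(positions)), 2):   # every pair with i < j, once
        d2 = sum((a - b) ** 2 for a, b in zip(positions[i], positions[j]))
        if d2 < r2_max:                                   # decision made outside the force loop
            i_idx.append(i)
            j_idx.append(j)
    return i_idx, j_idx
```

All branching happens while the list is built; a later force loop over `zip(i_idx, j_idx)` has a single fixed length and no conditionals, which is the property a vector pipeline needs.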

Another scheme, which we have not 
implemented yet, would be to analyze the 
forces involved so as to differentiate 
between “slow” and “fast” forces. 7 A 
slow force would be one that did not decay 


Design of a vectorized loop 

for the force evaluation in an MD simulation 


In any given molecular dynamics program, the force
evaluation easily accounts for more than 95 percent of the
CPU time. Nothing inherently prohibits vectorization of
the entire force evaluation in a time iteration. The
potential we seek is smoothly switched off at large distances
such that the energy between a pair of water molecules is
given by

\[ U = \sum_{i,j}^{\mathrm{H_2O\ atoms}} u_{ij} \cdot \mathrm{switch}(r_{ij}). \]

This also requires the energy to be calculated each time
step. The force is now given by

\[ F_{ij} = -\left( \frac{\partial u_{ij}}{\partial r_{ij}}\, \mathrm{switch}(r_{ij}) + u_{ij}\, \frac{\partial\, \mathrm{switch}(r_{ij})}{\partial r_{ij}} \right). \]

Scalar version of the water-water force evaluation. In
the original scalar code, the force is evaluated over a
double loop of all interacting pairs. The complication of
calculating the switching function does not add much to the
force evaluation. Where $N_{\mathrm{H_2O}}$ is the number of water
molecules and $R_{cut}$ is the given radial cutoff, we can write
this loop as

do i = 1, N_H2O - 1
   do j = i + 1, N_H2O
      find cutoff distance, d_cut
      if (d_cut^2 < R_cut^2) then
         do loop over all atoms on ith water
            do loop over all atoms on jth water
               find r_ij
               find switch(r_ij) and d/dr_ij switch(r_ij)
               find U(|r_ij|) and d/dr_ij U(|r_ij|)
               sum up the correct force for this pair
            endloop
         endloop
      endif
   endloop
endloop

Vector version of the water-water force evaluation. In
the vector version, we try to move as much of the
decision-making process as possible out of the loop. At regular
intervals we set up an interaction vector for the sphere of
influence, which contains the indices of all interacting
pairs, given the cutoff, and some additional interactions
that might possibly wander inside the cutoff sphere during
the given update interval. Before entering the new force
loop, we calculate all switches and derivatives. The cutoff
radius is treated as another switch, given the set of
interacting pairs. The force loop is now converted to a loop over
all interacting pairs. The double loop over the atoms on
each water molecule is expanded. This scheme is outlined
below; each step is an operation on an entire
vector of dimension l, where l is the total number of
interacting pairs.

find all distances, r, given an interaction vector
find switch(r^2) and d/dr switch(r^2)
find all U and d/dr U
find all force pairs, F_ij(r, U, switch)
scatter all force pairs according to an interaction vector

All indirect addressing is vectorized automatically on the
Amdahl 1200, resulting in a completely vectorized force
loop. The main drawback with this scheme is the increased
memory requirement on the machine. However, for a
moderate system with $N_{\mathrm{H_2O}} < 1000$ and a small cutoff, this
is not prohibitive.
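The five vector steps can be mirrored in NumPy, where every line operates on arrays of length l and the final scatter uses indirect addressing. This is a sketch under explicit assumptions, not the authors' Fortran: molecules are treated as single sites, the pair energy is a Lennard-Jones stand-in, the cubic switching function is a hypothetical choice (the article does not give its form), and all function names are invented for the example. A scalar double loop is included only as a reference.

```python
import numpy as np

def switch(r, r_on, r_off):
    """Assumed smooth switch: 1 below r_on, 0 above r_off, cubic in between."""
    t = np.clip((r_off - r) / (r_off - r_on), 0.0, 1.0)
    s = t * t * (3.0 - 2.0 * t)
    ds_dr = -6.0 * t * (1.0 - t) / (r_off - r_on)
    return s, ds_dr

def pair_potential(r, eps=1.0, sig=1.0):
    """Stand-in pair energy u(r) and its derivative du/dr (Lennard-Jones form)."""
    sr6 = (sig / r) ** 6
    u = 4.0 * eps * (sr6 * sr6 - sr6)
    du_dr = -24.0 * eps * (2.0 * sr6 * sr6 - sr6) / r
    return u, du_dr

def forces_scalar(pos, r_on, r_off):
    """Reference double loop with the cutoff test inside the loop."""
    n = len(pos)
    energy = 0.0
    forces = np.zeros_like(pos)
    for i in range(n - 1):
        for j in range(i + 1, n):
            rij = pos[i] - pos[j]
            r = np.sqrt(rij @ rij)
            if r < r_off:
                s, ds = switch(r, r_on, r_off)
                u, du = pair_potential(r)
                energy += u * s
                f = -(du * s + u * ds) / r * rij   # force on i due to j
                forces[i] += f
                forces[j] -= f
    return energy, forces

def forces_vector(pos, i_idx, j_idx, r_on, r_off):
    """Single pass over a precomputed interaction vector (i_idx, j_idx)."""
    rij = pos[i_idx] - pos[j_idx]                  # gather: all pair separations
    r = np.sqrt(np.einsum("ij,ij->i", rij, rij))   # all distances
    s, ds = switch(r, r_on, r_off)                 # all switches and derivatives
    u, du = pair_potential(r)                      # all U and dU/dr
    f = (-(du * s + u * ds) / r)[:, None] * rij    # all pair forces
    forces = np.zeros_like(pos)
    np.add.at(forces, i_idx, f)                    # scatter back by index
    np.add.at(forces, j_idx, -f)
    return float(np.sum(u * s)), forces
```

`np.add.at` is used for the scatter because the same molecule index appears in many pairs; plain fancy-index assignment would silently drop the repeated contributions. The extra arrays of length l are the memory cost the text mentions.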
Civic Plaza — Convention Center, Phoenix, Arizona
JUNE 8-12,1987 


TECHNICAL SESSIONS 

Exploring issues of interest to the UNIX® technical 
community, these sessions provide a forum for 
presentation and discussion on a variety of topics. Topic 
areas include, but are not limited to: 

■ Kernel enhancements, measurements, etc. 

■ Programming languages and environments 

■ UNIX in the office environment 

■ Standards and portability 

■ New mail systems (e.g. X.400-based systems or user 
interfaces incorporating new interface paradigms) 

■ Applications, especially unusual ones such as computer 
aided music, factory automation, etc. 

■ Security 

■ UNIX vs. the naive user 

■ Workstations: comparisons, experiences, trends 

■ Beyond UNIX—what next? 

TUTORIALS 


VENDOR EXHIBITION 

The USENIX Association’s annual Vendor Exhibition will
be held June 9-11 at the Phoenix Civic Plaza. The
primary intent of this Exhibition is to provide vendors an
opportunity to display advanced technology in hardware
and software innovations relevant to the UNIX technical
community.

A multi-vendor networking capability will allow
communication between participating vendors. State of the art
networking features will be demonstrated.

THE SPONSOR 

The USENIX Association is an international technical and 
professional organization devoted to fostering innovation 
with a historical and present UNIX bias. It promotes the 
exportation and importation of ideas, encourages research 
that works and problem-solving with a practical bias. 

Plan now to attend the 1987 Summer USENIX 
Conference and Exhibition for the latest in UNIX 
applications and research. 



The USENIX Association will offer its well-respected,
in-depth tutorial program on June 8 and June 9. Presented
by leading experts, the tutorials provide detailed
examination of several areas of UNIX technology. Topics
include:


■ 4.xBSD and System V Internals

■ Networking Developments

■ Windowing Schemes

■ UNIX System Implementations

■ Software Development Tools

■ Graphics

■ Intermediate and Advanced UNIX and C Programming

■ Administration of Systems and Networks


For complete conference details,
call: (213) 592-1381 or
(213) 592-3243
or write:

USENIX Conference Office,
P.O. Box 385,
Sunset Beach, CA 90742

The Professional
and Technical
UNIX Association


























Figure 5. Coordinate projection of an N = 216 water cluster and density profile of the water-vacuum interface. The water cluster is
roughly spherical with offshoots of smaller assemblies of water molecules. Some of the tetrahedral structure of the water hydrogen
bond network can be seen. The water-vacuum interface is on the average rather smooth.


over an appreciable time in the simulation. 
If one could separate these forces from the 
force calculation and evaluate them only 
periodically, one could drastically reduce 
the interaction calculations needed to 
propagate one time step. 

For a typical set of parameters with 216
water molecules, we achieved a megaflop
rate of 350 out of the possible 530. Again,
this is promising when viewed in light of
the complexity of the program, approximately
3000 lines of code, and the relatively
standard programming techniques
employed. In this case, we were able to
calculate approximately 250,000 time steps
per CPU hour.

Simulations. Preliminary simulations
have been carried out using some smaller
clusters, $N \sim O(100)$. The integration time
step was set to $2.5 \times 10^{-16}$ seconds, so that
each CPU hour covers $6.3 \times 10^{-11}$
seconds, or 0.06 nanoseconds, of real
time. The calculations so far
show that the cluster stays together as a
well-defined cluster entity over the
simulation length. Figure 5 shows a coordinate
projection of an N = 216 cluster with the
interface profile given as an oxygen pair
correlation function from the center of the
cluster.
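As a quick consistency check on these figures, using only numbers quoted in the article and assuming the stated rate of roughly 250,000 time steps per CPU hour applies to these runs:

\[ 2.5 \times 10^{5}\ \text{steps} \times 2.5 \times 10^{-16}\ \text{s per step} = 6.25 \times 10^{-11}\ \text{s} \approx 0.06\ \text{ns per CPU hour}. \]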


Our experience shows that a fast
vector processor can be used
effectively to simulate complex
physical systems. Once the basic structure
of the program is complete, it can be used
on different machines with only minor
adjustments, such as inserting the proper
machine-dependent instructions to
vectorize certain types of scalar loops.

The importance of having vectorized
code available for MD and MC
simulations should be clear, since supercomputer
facilities are becoming more and more
accessible. The main advantage of being
able to generate vectorized code using only
standard Fortran is that it will allow
workers to port their programs between
different supercomputers. □


Acknowledgments 

This research was supported by grants from 
the National Science Foundation and the 
National Institutes of Health. 

The authors wish to thank the Amdahl
Corporation for a generous grant of computer time
and for the use of its facilities. One author,
Anders Wallqvist, also wishes to thank John
Straub and David Coker for their helpful
comments.


References 

1. E.J. Hart and M. Anbar, eds., The Hydrated
Electron, Wiley, New York, 1970.

2. J. Jortner and N.R. Kestner, eds., Electrons
in Fluids, Springer, New York, 1973.

3. J. Kushick and B.J. Berne, “Molecular
Dynamics Methods: Continuous Potentials,”
in Modern Theoretical Chemistry, Vol.
5, B.J. Berne, ed., Plenum, New York, 1977.

4. K.J.M. Moriarty, M. Haraguchi, and C.
Pangali, “Efficient Implementation of the
SU(3) Lattice Gauge Theory Algorithm on
the Fujitsu VP200 Vector Processor,” Comp.
Phys. Comm., Vol. 34, No. 1, Jan. 1984, pp.
1-7.

5. A. Wallqvist, D. Thirumalai, and B.J. Berne,
“Path-Integral Monte Carlo Study of the
Hydrated Electron,” J. Chem. Phys. (in
press).

6. A. Wallqvist and B.J. Berne, “Path-Integral
Simulation of Pure Water,” Chem. Phys.
Lett., Vol. 117, June 1984, pp. 214-219.

7. O. Teleman and B. Jonson, “Vectorizing a
General-Purpose Molecular Dynamics
Simulation Program,” J. Comp. Chem.,
Vol. 7, No. 1, Jan./Feb. 1986, pp. 58-66.


COMPUTER 























Anders Wallqvist is a post-doctoral fellow in 
chemical physics at Columbia University. His 
research has dealt with the behavior of the sol¬ 
vated electron in different media, quantum 
effects in liquids, and models for molecular 
interactions. 

His undergraduate degree is from the Univer¬ 
sity of Lund, and he received a PhD in chemi¬ 
cal physics from Columbia University in 1986. 



Bruce J. Berne is a professor of chemistry at 
Columbia University. His research interests 
include the theory of liquids, molecular 
dynamics and Monte Carlo simulations of con¬ 
densed matter, theoretical chemical kinetics, 
chaos in molecular systems, dynamics and struc¬ 
ture of small molecular clusters, and simulations 
of quantum many-body systems. 

Berne serves on the editorial boards of the 
Journal of Physical Chemistry and Advances in 
Chemical Physics and is a fellow of the Ameri¬ 
can Physical Society. He received a BS from 
Brooklyn College in 1961 and a PhD in chemi¬ 
cal physics from the University of Chicago in 
1964. 



Chani Pangali is an employee of the Amdahl 
Corporation. His research interests are in 
geophysics, signal processing, and computer 
architecture. 

Pangali graduated with a BA from Merton 
College, Oxford, and received his PhD in 
chemistry from Columbia University in 1979. He 
is a member of the Society for Industrial and 
Applied Mathematics. 

Questions regarding this article may be
addressed to Wallqvist at the Dept. of Chemistry,
Columbia University, New York, NY 10027.


COMPUTER SECURITY 
SCIENTISTS 


Tomorrow’s Computing 
Technology is Today’s Challenge 


at IDA




Some of the nation’s most exciting
developments in software
technology, supercomputer
architecture, AI, and expert systems
are under scrutiny right
now at the Institute for Defense
Analyses. IDA is a Federally
Funded Research and Development
Center serving the Office of
the Secretary of Defense, the
Joint Chiefs of Staff, Defense
Agencies, and other Federal
sponsors.

IDA’s Computer and Software
Engineering Division (CSED) is
seeking professional staff
members with an in-depth
theoretical and practical background
in the area of Computer
Security. Tasks include efforts
on both the design/development
of techniques to assess and
assure security and providing
advice to DoD decision makers
on appropriate and feasible
policy regarding security.

Specific desired skills and interests
include:

• Formal verification, with
emphasis on the Ada language

• Secure kernels and reference
monitors

• Security in multiprocessor
systems

• Fault-tolerance in secure
systems

• Operating system, data base
and network security criteria

• Testing and evaluation


Specialists in other areas of
Computer Science are also
sought: Software Engineers,
Distributed Systems, Artificial
Intelligence and Expert Systems, and
Programming Language
Experts.

We offer career opportunities at
many levels of experience. You
may be a highly experienced
individual able to lead IDA
projects and programs ... or a
recent MS/PhD graduate. You
can expect a competitive salary,
excellent benefits, and a superior
professional environment.

Equally important, you can
expect a role on the leading edge
of the state of the art in computing.
If this kind of future appeals
to you, we urge you to investigate
a career with IDA. Please
forward your resume to:

Mr. Thomas J. Shirhall 
Manager of Professional Staffing 
Institute for Defense Analyses 
1801 N. Beauregard Street 
Alexandria, VA 22311 
An equal opportunity employer. 
U.S. Citizenship is required. 



IDA


May 1987 


















CSM 



Advance Announcement 

CONFERENCE ON 
SOFTWARE MAINTENANCE-1987 

AUSTIN, TEXAS 
SEPTEMBER 21-24,1987 


The Conference on Software Maintenance-1987 (CSM-87) will gather software managers, developers, maintainers, and 
researchers to discuss new solutions to the continuing challenge of software maintenance and software maintainability. 
CSM-87 will acquaint managers and practitioners with current advances and researchers with current needs. 


IEEE Computer Society,
Technical Committee on Software Engineering

The Institute of Electrical and Electronics Engineers, Inc.

National Bureau of Standards (NBS)


IN COOPERATION WITH:

ACM/SIGSOFT,
ACM Special Interest Group on Software Engineering

Association for Women in Computing (AWC)

Software Maintenance Association (SMA)


MONDAY, SEPTEMBER 21 . ALL DAY TUTORIALS 


I. Tutorial on Software Restructuring-Robert Arnold 

II. Advanced Practical Management Methods and Technologies Controlling Software 
Maintenance Cost, Productivity, and Maintainability-Tom Gilb 

III. C, UNIX and Software Portability-Alan Filipski 


GENERAL SESSIONS 
TUESDAY, SEPTEMBER 22 


KEYNOTE ADDRESS: 

Improving Software Maintenance Productivity, Barry Boehm 

TRACK A
Session 1A

Metrics for Software Maintenance
Management (Panel)

Session 2A

Empirical Studies Applying Control-Flow
Metrics to COBOL

Session 3A

Formal Studies Statistical Rationale For
Evolution Dynamics


TRACK B
Session 1B

Management Techniques, Practices and 
Procedures 

Session 2B 

Legal Implications of Software 
Maintenance (Panel) 

Session 3B 

Software Maintenance Tools 


Session 4A-B 

Employing Standards To Improve Software Maintenance 


WEDNESDAY, SEPTEMBER 23 

PLENARY SESSION: 

Embedded Computer Software Maintenance Environment, James McCall 


TRACK A 
Session 5A 

Mini-Tutorial: Reducing Risks In Software 
Maintenance 

Session 6A 

Advanced Techniques For Software 
Maintenance 

Session 7A 

Status and Advances in Software 
Maintenance Education 


TRACK B
Session 5B 

Software Restructuring and Portability 

Session 6B 

Mini-Tutorial: Control It Or Lose It: 
Responsive Configuration Management, 
A Must For Software Maintenance 

Session 7B 

Quantitative Evaluation of Maintenance 


THURSDAY, SEPTEMBER 24 


CONFERENCE COMMITTEE 


GENERAL CHAIR 
Roger J. Martin 
National Bureau 
of Standards 
Gaithersburg, MD 20899 
(301) 975-3295 
PROGRAM CO-CHAIRS 
Dr. Robert S. Arnold 
1880 Campus Common 
Drive, North 
Reston, VA 22091
(703) 648-1880 
Wilma M. Osborne 
National Bureau of 
Standards 
Bldg. 225, Rm. B266 
Gaithersburg, MD 20899 
(301)975-3339 
VENDOR EXHIBITS CHAIR 
Gary L. Richardson 
Texaco Inc. 

Houston, TX 


TUTORIAL CHAIR 
Prof. Mel Colter 
Colter Enterprises 
Colorado Springs, CO 
PUBLICITY CHAIR 
Mary Anne Overman 
DOD 

Ft. Meade, MD 

LOCAL ARRANGEMENTS 

CHAIR 

Dr. Brian Fugate 
MCC 

Austin, TX 
IMMEDIATE PAST 
GENERAL CHAIR 
Nicholas Zvegintzov 
Staten Island, NY 


PROGRAM COMMITTEE 


Mr. David Beilin 
Pratt Institute 
Mr. Bruce Blum 
Johns Hopkins University 
Prof. Mel Colter 
Colter Enterprises 


Mr. Steve Oxman 
OXKO Corporation 
Mr. Donald A. Parker 
NASA-GSFC 
Dr. Dieter Rombach 
University of Maryland 
Mr. Gary C. Sackett 
Hughes Aircraft Co.

Prof. Norman Schneidewind 
Naval Postgraduate School 
Ms. Dolores R. Wallace 
National Bureau of Standards 


For complete Advance Program, contact: 
Software Maintenance Conference 
c/o Computer Society of the IEEE 
1730 Massachusetts Ave. N.W. 

Washington, DC 20036-1903 
(202) 371-0101 


PLENARY SESSION: 

Perspectives On Software Maintenance I and II, Norman Schneidewind-Chair 









Conference on Software Maintenance 

September 21-24, 1987 
Austin, Texas 


Complete and return this form along with your check to: 

Software Maintenance 87 

c/o Computer Society 

1730 Massachusetts Ave. NW 

Washington, DC 20036-1903 

Telephone: (202) 371-1013 

Telex: 7108250437 IEEECOMPSO 


Circle appropriate fee:

              Advance Registration      Late Registration
              (prior to 9/4/87)         (after 9/4/87)
              Member    Non-Member      Member    Non-Member
Conference    $160      $200            $190      $240
Tutorial      $160      $200            $190      $240

□ Student Conference Fee: $50

□ Check Enclosed
□ VISA □ MasterCard □ American Express □ Choice

Total Enclosed: $____


Card No.: ____    Exp. Date: ____

Affiliation: ____

Mailing Address: ____

City: ____    Zip: ____

Work Phone: ____    Home Phone: ____

Registration Fee includes proceedings and reception. Request for refunds must be received in writing no later than September 4,1987. 


La Mansion 

Hotel Reservation Form 

Conference on Software Maintenance 
September 21-24, 1987 


Complete this form and return to: 

La Mansion Hotel 

Reservations 

6505 IH-35 North

Austin, Texas 78752 

(512) 454-3737 


Please check accommodations and price desired: 

□ Single occupancy: $74.00 

□ Double occupancy: $84.00 

Arrival Date: ____

Departure Date: ____

□ am □ pm

Reservations must be received by the La Mansion by September 1, 1987. 
Reservations received after September 1, 1987 will be accepted on a space 
available basis. 


All major credit cards accepted: □ VISA □ MasterCard □ American Express 


Card No. 


Exp. Date 


Signature 




Checkout: 12:00 p.m. 


Address 


City/State/Zip/Country 
































The Nature and Evaluation 
of Commercial 
Expert System Building Tools 


William B. Gevarter 
NASA Ames Research Center 


ESBTs make it 
possible to build an 
expert system in an 
order of magnitude 
less time than is 
possible with Lisp 
alone. This article 
reviews such tools. 


The development of new expert systems
is changing rapidly, in
terms of both ease of construction
and time required, because of improved
expert system building tools (ESBTs).
These tools are the commercialized derivatives
of artificial intelligence systems developed
by AI researchers at universities and
research organizations. It has been
reported that these tools make it possible
to develop an expert system in an order of
magnitude less time than would be
required with the use of traditional development
languages such as Lisp. In this
article, I review the capabilities that make
an ESBT such an asset and discuss current
tools in terms of their incorporation of
these capabilities.


The structure of an 
expert system building 
tool 

The core of an expert system consists of
a knowledge base and an accompanying
inference engine that operates on the
knowledge base to develop a desired solution
or response. If one is to use such a system,
an end-user interface or an interface
to an array of sensors and effectors is
required for communication with the relevant
world. (A “relevant world” is a system
or situation operated on by or in contact
with the expert system.) In addition, to
facilitate the development of an expert system,
an ESBT must also include an interface
to the developer

• so that the requisite knowledge base
can be built for the particular application
domain for which the system is
intended,

• so that the appropriate end-user interface
can be developed, and

• to incorporate any special instructions
to the inference engine (reasoning system)
that are required for the particular
domain.

The character and quality of these interfaces
are two of the main differences
between commercial tools and ESBTs
developed at universities and used in
research. Also important in the structure
of ESBTs are

• interfaces to other software and databases,
and

• the computers on which the ESBTs
will run: not only the computers
used for development of expert systems,
but also those used for their
delivery to an end user.

Figure 1 summarizes the structure of an
ESBT.

Figure 1. The structure of an expert system

Figure 2. Different methods of knowledge representation.


Knowledge representation

The knowledge that can be easily repre¬ 
sented by the tool is a key consideration in 
choosing an ESBT. As indicated by Figure 
2, there are three aspects of knowledge rep¬ 
resentation that are fundamental to these 
tools: object descriptions (declarative
knowledge such as facts), certainties, and 
actions. One method of representing objects 
is by frames with or without inheritance. 
(Inheritance allows knowledge bases to be 
organized as hierarchical collections of 
frames that inherit information from 
frames above them. Thus, an inheritance 
mechanism provides a form of inference.) 
Frames are tabular data structures for 
organizing representations of prototypical 
objects or situations. A frame has slots that 
are filled with data on objects and relations 
appropriate to the situation. One version 
of programming referred to as object- 
oriented programming utilizes objects that 
incorporate provisions for message passing 
between objects; attached to these objects 
are procedures that can be activated by the 
receipt of messages. Declarative knowledge 
can also be represented by parameter-value 
pairs, by use of logic notation, and, to 
some extent, by rules. 
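
The frame-with-inheritance idea can be illustrated with a minimal Python sketch. It is not taken from any particular ESBT: the `Frame` class, its `get` method, and the pump example are hypothetical names chosen for illustration. A slot lookup that climbs the hierarchy is the simple form of inference that inheritance provides.

```python
class Frame:
    """A frame: a named bundle of slots with an optional parent frame."""

    def __init__(self, name, parent=None, **slots):
        self.name = name
        self.parent = parent
        self.slots = dict(slots)

    def get(self, slot):
        """Return the slot value, searching up the hierarchy if needed."""
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.parent
        raise KeyError(f"{self.name} has no slot {slot!r}")


# Hypothetical hierarchy: a generic pump and one specific pump.
pump = Frame("pump", kind="rotating equipment", max_temp_c=90)
feed_pump = Frame("feed_pump_1", parent=pump, location="unit 3", max_temp_c=75)

print(feed_pump.get("location"))    # local slot
print(feed_pump.get("kind"))        # inherited from the "pump" frame
print(feed_pump.get("max_temp_c"))  # local value overrides the inherited one
```

In this sketch a local slot value takes precedence over an inherited one; real tools may offer several inheritance policies.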

Actions change a situation and/or 
modify the relevant database. Actions are 
most commonly represented by rules. 
These rules may be grouped together in 
modules (usually as subparts of the prob¬ 
lem) for easy maintenance and rapid 
access. Actions may also be represented in 
terms of examples, which indicate the con¬ 
clusions or decisions reached. Examples 
are a particularly desirable form of repre¬ 
sentation for facilitating knowledge acqui¬ 
sition, and inductive systems capitalize on 
them. Examples are much easier to elicit 
from experts than rules, and may often be 
a natural form of domain knowledge. 
Actions can also be expressed in logic nota¬ 
tion, which is a form of rule representation. 
Finally, actions can be expressed as proce¬ 
dures elicited by either 

• messages (in object-oriented pro¬ 
gramming) or 

• changes in a global database that are 
observed by demons. (“Demons” are 
procedures that monitor a situation 
and respond by performing an action 
when their activating conditions 
appear.) 
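
As a rough illustration of demons, the following Python sketch attaches condition-action pairs to a global database and checks them after every change. The `Database` class, its method names, and the pressure threshold are assumptions made for this example only.

```python
class Database:
    """A global database whose writes are watched by demons."""

    def __init__(self):
        self.values = {}
        self.demons = []  # list of (condition, action) pairs

    def add_demon(self, condition, action):
        self.demons.append((condition, action))

    def set(self, key, value):
        self.values[key] = value
        # After every change, fire each demon whose condition now holds.
        for condition, action in self.demons:
            if condition(self.values):
                action(self.values)


alarms = []
db = Database()
db.add_demon(
    condition=lambda v: v.get("pressure", 0) > 100,  # hypothetical limit
    action=lambda v: alarms.append(f"high pressure: {v['pressure']}"),
)

db.set("pressure", 80)   # condition not met, nothing happens
db.set("pressure", 120)  # activating condition appears, demon fires
print(alarms)
```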

In addition to the representation of
objects and actions, one must consider the
degree to which the knowledge or data is
known to be correct. Thus, most ESBTs 
have provisions for representing certainty. 
The most common approach is to incorpo¬ 
rate “confidence factors”; this approach 
is a derivative of the approach used in the 
Mycin expert system [1]. Fuzzy logic and
probability are also used. 

An alternative way of handling uncer¬ 
tainties or tentative hypotheses is to con¬ 
sider multiple worlds in which different 
items are true or not true in these alterna¬ 
tive worlds. Another consideration is 
whether or not a deep model (which is a 
structural or causal model) of the system 
can readily be built with the tool in ques¬ 
tion as an aid in model-based reasoning. 
(The same underlying model can often be 
employed for other uses, such as preservation
of knowledge and training.) Finally,
system size (for example, as measured by
the number of rules needed) can be of crit¬ 
ical importance, as it can have an impor¬ 
tant effect on memory requirements, 
memory management, and runtimes. 

Inference engine 

Figure 3. Inference-engine possibilities.

Figure 3 indicates the major alternative
means by which an ESBT performs
inferencing. The most usual approach is
classification, which is appropriate for
situations in which there is a fixed number of
possible solutions. Hypothesized conclusions
from this set are evaluated as to
whether they are supported by the evidence.
This evaluation is usually done by
backward chaining through if-then (that is,
antecedent-consequent) rules, starting with
rules that have the hypothesized conclusions
as their consequents. The rule base is then
searched for rules whose consequents
establish the antecedents (input conditions)
of the hypothesized-conclusion rule. This process
continues recursively until the hypothesis
is fully supported or until either a negation
or a dead-end is reached. If either of the
latter two events happens, additional
hypotheses may be tried until some conclusion
is reached or the process is terminated.
This depth-first, backward-chaining
approach was popularized by the Mycin
expert system. The corresponding Emycin
ESBT shell [2] is the prototype of virtually
all the hypothesis-driven (that is, goal-driven)
commercial ESBTs currently
available.
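
The depth-first, goal-driven search described above can be sketched in a few lines of Python. This is a simplified illustration, not Emycin's actual algorithm: it ignores certainty factors and user questioning, and the rule contents and the `prove` function are hypothetical.

```python
# Each rule: (list of antecedents, consequent). Rule content is hypothetical.
RULES = [
    (["fever", "stiff_neck"], "meningitis_suspected"),
    (["temperature_above_38"], "fever"),
]


def prove(goal, facts, rules, seen=frozenset()):
    """Depth-first backward chaining: is `goal` supported by `facts`?"""
    if goal in facts:
        return True
    if goal in seen:  # guard against circular rule chains
        return False
    for antecedents, consequent in rules:
        if consequent == goal:
            # The antecedents become subgoals, proved recursively.
            if all(prove(a, facts, rules, seen | {goal}) for a in antecedents):
                return True
    return False  # dead end: no rule supports this goal


facts = {"temperature_above_38", "stiff_neck"}
print(prove("meningitis_suspected", facts, RULES))
print(prove("meningitis_suspected", {"stiff_neck"}, RULES))
```

A shell would typically try each candidate hypothesis in turn with a call of this kind and report those that succeed.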

Forward chaining starts with data to be 
input or with the situation currently present
in a global database. The data or the
situation is then matched with the antecedent
conditions in each of the relevant rules
to determine the applicability of the rule to 
the current situation. (The current situa¬ 
tion is usually represented in the global 
database by a set of attributes and their 
associated values.) One of the matching 
rules is then selected (for example, by the 
use of meta-rules, which help determine the 
order in which the rules are tried, or by priorities),
and the rule’s consequents are used
to add information to the database or to
actuate some procedure that changes the 
global situation. Forward chaining pro¬ 
ceeds recursively (in a manner similar to 
that of backward chaining), terminating 
either when a desired result or conclusion 
is reached or when all relevant rules are 
exhausted. Combinations of forward and 
backward chaining have also been found 
useful in certain situations. 
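
A minimal Python sketch of the match-select-act cycle follows. It assumes that rule selection is done by a numeric priority (one of the options mentioned above) and that a rule's only action is to add a fact; the rule contents and the `forward_chain` function are hypothetical.

```python
# Each rule: (priority, antecedents, consequent). Contents are hypothetical.
RULES = [
    (2, ["valve_open", "pump_on"], "flow_expected"),
    (1, ["flow_expected", "no_flow_reading"], "blockage_suspected"),
]


def forward_chain(facts, rules):
    """Fire rules until no rule adds anything new; return all known facts."""
    known = set(facts)
    fired = True
    while fired:
        fired = False
        # Conflict resolution: try higher-priority rules first.
        for _, antecedents, consequent in sorted(
            rules, key=lambda rule: rule[0], reverse=True
        ):
            if consequent not in known and all(a in known for a in antecedents):
                known.add(consequent)  # the consequent updates the database
                fired = True
                break  # re-match from the top after each firing
    return known


result = forward_chain({"valve_open", "pump_on", "no_flow_reading"}, RULES)
print(sorted(result))
```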

Forward reasoning (a more general form 
of forward chaining) can be done with 
data-driven rules or with data-driven 
procedures (demons). 

Hypothetical reasoning refers to solution 
approaches in which assumptions may 
have to be made to enable the search pro¬ 
cedure to proceed. However, later along the 
search path, it may be found that certain 
assumptions are invalid and therefore have 
to be retracted. This nonmonotonic reason¬ 
ing (that is, reasoning in which facts or con¬ 
clusions must be retracted in light of new 
information) can be handled in a variety of 
ways. One approach that reduces the dif¬ 
ficulty of the computation is to carry along 
multiple solutions (these solutions repre¬ 
sent different hypotheses) in parallel and 
to discard inappropriate ones as evidence 
that contradicts them is gathered. This 
approach is referred to as viewpoints, con¬ 
texts, and worlds in different tools. Another 
approach is to keep track of the assump¬ 
tions that support the current search path 
and to backtrack to the appropriate branch 
point when the current path is invalidated. 
This latter approach has been referred to 
by names like nonchronological backtracking.
A related capability is truth maintenance,
which removes derived beliefs
when their conditions are no longer valid. 
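
The first approach, carrying alternative hypotheses in parallel and discarding those that are contradicted, can be sketched as follows. This is only an illustration of the idea; it does not reproduce how any tool implements viewpoints, contexts, or worlds, and the names `prune_worlds`, `contradicts`, and the fault example are hypothetical.

```python
def prune_worlds(worlds, evidence_items, contradicts):
    """Carry alternative worlds in parallel, dropping contradicted ones."""
    alive = dict(worlds)
    for evidence in evidence_items:
        alive = {
            name: world
            for name, world in alive.items()
            if not contradicts(world, evidence)
        }
    return alive


# Three hypothetical worlds, each built on a different assumption.
WORLDS = {
    "world_a": {"fault": "sensor"},
    "world_b": {"fault": "pump"},
    "world_c": {"fault": "valve"},
}


def contradicts(world, evidence):
    """In this toy example, evidence is simply a fault that was ruled out."""
    return world["fault"] == evidence


remaining = prune_worlds(WORLDS, ["sensor", "valve"], contradicts)
print(remaining)
```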

Object-oriented programming is an 
approach in which both information about 
an object and the procedures appropriate 
to that object are grouped together into a 
data structure such as a frame. These 
procedures are actuated by messages that 
are sent to the object from a central con¬ 
troller or another object. This approach is 
particularly useful for simulations involv¬ 
ing a group of distinct objects and for real¬ 
time signal processing. 

The blackboard inference approach is 
associated with a group of cooperating 
expert systems that communicate by sharing
information on a common data structure
that is referred to as a “blackboard.”
An agenda mechanism can be used to 
facilitate the control of solution develop¬ 
ment on the blackboard. 
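
A toy Python sketch of a blackboard with an agenda is given below. The knowledge sources, their dictionary layout (`can_run`, `run`, `priority`), and the `run_blackboard` control loop are assumptions made for illustration; real blackboard systems use far richer control.

```python
def run_blackboard(blackboard, knowledge_sources, max_cycles=20):
    """Let knowledge sources take turns contributing to a shared blackboard."""
    for _ in range(max_cycles):
        # The agenda lists the sources whose trigger condition currently holds.
        agenda = [ks for ks in knowledge_sources if ks["can_run"](blackboard)]
        if not agenda:
            break
        # Simplest possible control: run the highest-priority agenda entry.
        best = max(agenda, key=lambda ks: ks["priority"])
        best["run"](blackboard)
    return blackboard


# Two hypothetical cooperating experts: one segments input, one labels it.
segmenter = {
    "priority": 1,
    "can_run": lambda bb: "signal" in bb and "segments" not in bb,
    "run": lambda bb: bb.update(segments=bb["signal"].split()),
}
classifier = {
    "priority": 1,
    "can_run": lambda bb: "segments" in bb and "labels" not in bb,
    "run": lambda bb: bb.update(
        labels=["long" if len(s) > 3 else "short" for s in bb["segments"]]
    ),
}

board = run_blackboard({"signal": "abcd ef ghijk"}, [classifier, segmenter])
print(board["labels"])
```

Neither expert calls the other; each reacts only to what it finds on the blackboard.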

In the case of ESBTs, logic commonly 
refers to a theorem-proving approach
involving unification. “Unification” refers
to substitutions of variables performed in 
such a way as to make two items match 
identically. The common logic implementations
are versions of a logic-programming
language, Prolog, that utilize
a relatively exhaustive depth-first
search approach. 
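
Unification itself can be shown with a short Python sketch. The conventions here are assumptions for the example: terms are tuples, variables are strings beginning with "?", and the occurs check that a full implementation would include is omitted.

```python
def is_var(term):
    """Variables are strings starting with '?' (a convention assumed here)."""
    return isinstance(term, str) and term.startswith("?")


def walk(term, subst):
    """Follow variable bindings until reaching a value or an unbound variable."""
    while is_var(term) and term in subst:
        term = subst[term]
    return term


def unify(x, y, subst=None):
    """Return a substitution making x and y identical, or None on failure."""
    subst = {} if subst is None else subst
    x, y = walk(x, subst), walk(y, subst)
    if x == y:
        return subst
    if is_var(x):
        return {**subst, x: y}
    if is_var(y):
        return {**subst, y: x}
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for xi, yi in zip(x, y):
            subst = unify(xi, yi, subst)
            if subst is None:
                return None
        return subst
    return None  # two different constants, or structures of different shape


print(unify(("parent", "?x", "bob"), ("parent", "alice", "?y")))
print(unify(("p", "?x", "?x"), ("p", "a", "b")))
```

The second call fails because "?x" cannot be bound to both "a" and "b".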

An important inference approach found 
in some tools is the ability to generate rules 
or decision trees inductively from exam¬ 
ples. Human experts are often able to artic¬ 
ulate their expertise in the form of 
examples better than they are able to 
express it in the form of rules. Thus, induc¬ 
tive learning techniques (which are cur¬ 
rently limited in their expressiveness) are 
frequently ideal methods of knowledge 
acquisition for rapid prototyping when 
examples can be simply expressed in the 
form of a conclusion associated with a sim¬ 
ple collection of attributes. The human 
builders of the resultant expert system can 
then refine it iteratively by critiquing and 
modifying the results inductively 
produced. Inductive inference usually pro¬ 
ceeds by starting with one of the input 
parameters and searching for a tree featur¬ 
ing the minimum number of decisions 
needed to reach a conclusion. This 
minimum-depth tree is found by cycling 
through all parameters as possible initial 
nodes and using an information theoretic 
approach to select the order of the 
parameters to be used for the remaining 
nodes and to determine which parameters 
are superfluous. An “information the¬ 
oretic approach” is one that chooses the 
solution that requires the minimum 
amount of information to represent it. The 
depth of the tree is usually relatively shal¬ 
low (often less than five decisions deep), so 
large numbers of examples usually result in 
broad, shallow trees. 
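
To make the inductive approach concrete, the Python sketch below builds a decision tree from examples. It substitutes a greedy entropy-based attribute choice (in the style of ID3) for the exhaustive minimum-depth search described above, so it illustrates the information-theoretic idea rather than any tool's actual procedure. The example data and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def best_attribute(examples, attributes):
    """Pick the attribute whose split leaves the least remaining entropy."""
    def remainder(attr):
        groups = {}
        for ex in examples:
            groups.setdefault(ex[attr], []).append(ex["conclusion"])
        return sum(len(g) / len(examples) * entropy(g) for g in groups.values())
    return min(attributes, key=remainder)


def induce(examples, attributes):
    """Build a tree: either a conclusion or (attribute, {value: subtree})."""
    conclusions = [ex["conclusion"] for ex in examples]
    if len(set(conclusions)) == 1 or not attributes:
        return Counter(conclusions).most_common(1)[0][0]
    attr = best_attribute(examples, attributes)
    rest = [a for a in attributes if a != attr]
    branches = {}
    for value in {ex[attr] for ex in examples}:
        subset = [ex for ex in examples if ex[attr] == value]
        branches[value] = induce(subset, rest)
    return (attr, branches)


# Hypothetical examples: each is a conclusion plus a simple set of attributes.
EXAMPLES = [
    {"noise": "yes", "leak": "no", "conclusion": "bearing"},
    {"noise": "yes", "leak": "yes", "conclusion": "bearing"},
    {"noise": "no", "leak": "yes", "conclusion": "seal"},
    {"noise": "no", "leak": "no", "conclusion": "ok"},
]

tree = induce(EXAMPLES, ["noise", "leak"])
print(tree)
```

In the resulting tree, "leak" is never consulted once "noise" is "yes", which is how a superfluous parameter shows up on a branch.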

Some tools incorporate demons that 
monitor local values and execute proce¬ 
dures when the actuation conditions of the 
demons appear. These tools are particu¬ 
larly appropriate for monitoring appli¬ 
cations. 

A number of tools offer a choice of 
several possible inference or search proce¬ 
dures. In systems built with such tools, 
means are usually made available to the 
system builder to control the choice of the 
inference strategy, which the builder causes 
to be dependent on the system state. Such 
control is referred to as meta-control. One 
form of meta-control is the use of control 
blocks, which are generic procedures that 
tell the system the next steps to take in a 
given situation so that the search will be
reduced, enabling a large number of rules
to be accommodated without the search 
space becoming combinatorially explosive. 

As the certainty of data, rules, and 
procedures is usually less than 100 percent, 
most systems incorporate facilities for cer¬ 
tainty management. Thus, they have vari¬ 
ous approaches for combining uncertain 
rules and information to determine a cer¬ 
tainty value for the result. 
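
One widely described scheme is the certainty-factor calculus associated with the Mycin family of systems. The Python sketch below follows the commonly cited combining formulas for factors in the range -1 to 1; it is a simplified illustration (for example, it omits the threshold below which a rule is not fired), and the function names are my own.

```python
def combine_cf(a, b):
    """Combine two certainty factors in [-1, 1] for the same conclusion."""
    if a >= 0 and b >= 0:
        return a + b * (1 - a)  # two supporting pieces of evidence
    if a <= 0 and b <= 0:
        return a + b * (1 + a)  # two opposing pieces of evidence
    # Mixed evidence; undefined when one factor is +1 and the other is -1.
    return (a + b) / (1 - min(abs(a), abs(b)))


def rule_cf(rule_strength, antecedent_cfs):
    """CF contributed by one rule: its strength times the weakest antecedent."""
    return rule_strength * max(0.0, min(antecedent_cfs))


print(combine_cf(0.6, 0.5))      # two rules supporting the same conclusion
print(combine_cf(0.8, -0.4))     # supporting and opposing evidence
print(rule_cf(0.7, [0.9, 0.5]))  # an uncertain rule with uncertain inputs
```

Combining supporting evidence never pushes the result past 1, and the order in which two factors are combined does not matter.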

Pattern matching is often required for 
mechanizing inference techniques, partic¬ 
ularly for matching rule antecedents to the 
current system state. The sophistication of 
the pattern-matching approach affects the 
capabilities of the system. Types of pattern 
matching vary—from matched identical 
strings to variables, literals, and wildcards, 
and can even include partial and/or 
approximate matching that can serve as 
analogical reasoning. 
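
The simpler end of this range, matching literals, variables, and wildcards against stored facts, can be sketched in Python as follows. The tuple representation, the "_" wildcard, and the "?name" variable convention are assumptions for this example; partial and approximate matching are not shown.

```python
def match(pattern, fact, bindings=None):
    """Match a pattern tuple against a fact tuple (one-way matching).

    Pattern elements may be literals, "_" (wildcard), or "?name" (variable).
    Returns the variable bindings on success, or None on failure.
    """
    bindings = dict(bindings or {})
    if len(pattern) != len(fact):
        return None
    for p, f in zip(pattern, fact):
        if p == "_":
            continue  # wildcard matches anything and binds nothing
        if isinstance(p, str) and p.startswith("?"):
            if bindings.setdefault(p, f) != f:
                return None  # variable already bound to something else
        elif p != f:
            return None  # literal mismatch
    return bindings


# Hypothetical facts describing the current system state.
FACTS = [("temp", "tank_a", 95), ("temp", "tank_b", 40), ("level", "tank_a", 3)]

hits = [b for b in (match(("temp", "?tank", "_"), f) for f in FACTS) if b is not None]
print(hits)
print(match(("same", "?x", "?x"), ("same", 1, 2)))
```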

Other ESBT capabilities vary from tool 
to tool. Some inference engines offer rapid 
and sophisticated math-calculation capa¬ 
bilities. One of the more valuable capabil¬ 
ities is supplied by inference engines that 
can manage modularized knowledge bases 
or modularized solution subproblems by 
accessing and linking these modules as 
needed. 

Another important consideration in a 
tool is the degree of integration of its vari¬ 
ous features. Full integration is desirable so 
that all the tool features can be brought to 
bear, if needed, on the solution of a single 
problem. For example, in the case of ESBTs 
incorporating both object representations 
and forward and backward chaining rules, 
it is desirable that expert-system developers 
be able to mix forward and backward 
chaining rules freely and be able to reason 
about information stored in objects when 
these actions are appropriate. 

The interface to the 
developer 

Various tools offer different levels of 
capabilities for the expert-system builder 
to use to mold the system. The simpler 
tools are shells into which knowledge is 
inserted in a specific, structured fashion. 
The more sophisticated tools are generally 
more difficult to learn, but allow the sys¬ 
tem developer a much wider choice of 
knowledge base representations, inference 
strategies, and the form of the end-user 
interface. Various levels of debugging 
assistance are also provided. Figure 4 provides
an indication of the possible options
(options are tool dependent) that are available
for each aspect of the interface to the
developer.

May 1987

Figure 5. Possibilities for the
end-user interface.

End-user interface 

Once the expert system has been built, 
its usability depends in large part on the 
end-user interface. Figure 5 provides an 
indication of the range of end-user facili¬ 
ties found in ESBTs. Since most expert 
systems are really intelligent assistants, the 
end-user interface is often designed to 
allow interactive dialogue. This dialogue 
and/or the initial input most often appear 
to the user as structured data-input 
arrangements incorporating menu choices 
that allow the user to answer requests by 
the system for information. In some cases, 
to increase system flexibility, systems will
accept multiple and uncertain user
responses and still arrive at conclusions 
(though the certainty of the resultant con¬ 
clusions is reduced). In sophisticated sys¬ 
tems, graphics are often used to show the 
line of reasoning when the system responds 
to users’ “how” questions; in simpler sys¬ 
tems, a listing of the rules supporting the 
system’s conclusions may be employed. 
ESBTs often answer a user’s “Why do you 
need this information?” question by quot¬ 
ing the rule for which the information is 
required. The ability of the system to 
answer the user’s “why” and “how” ques¬ 
tions is important, for it increases the end 
user’s confidence in the system’s decision¬ 
making ability. 

Other capabilities often found in ESBTs 
are facilities that allow the end user to 
select alternative parameter values and 
observe the effect on the outcome (these
facilities support “what if” queries), facilities
to allow the user to perform an initial
pruning of the line of questioning so that 
the system need not pursue areas that the 
user feels are irrelevant or unnecessary, 
and the capability to save examples for 
future consideration or use. 

Very sophisticated tools often include 
interactive graphics and simulation facil¬ 
ities that increase the end user’s under¬ 
standing and control of the system being 
represented. Above all, the end-user inter¬ 
face needs to be user friendly if the system 
is to be accepted. 

Programming-language
considerations

In addition to the structure and the
paradigms supported by a tool, the programming
language in which the tool is
written is of major importance. The language
determines whether the expert system
is compilable and, if it is, whether
incrementally or in a batch mode. Compilability
reduces the memory requirements
and increases the speed of the expert system;
incremental compilability speeds
development. Figure 6 illustrates the
aspects related to the tool-language choice.

1. Classification
   a. Interpretation of measurements
      Hypothesis selection based on evidence
   b. Diagnosis
      Measurement selection and interpretation
      (often involves models of system organization and behavior)
2. Design and synthesis
   Provide constraints as well as guidance
3. Prediction
   Forecasting
4. Use advisor
   “How to” advice
5. Intelligent assistant
   Provide decision aids
6. Scheduling
   Time-ordering of tasks, given resource constraints
7. Planning
   Many complex choices affect each other
8. Monitoring
   Provide real-time, reliable operation
9. Control
   Process control
10. Information digest
    Situation assessment
11. Discovery
    Generate new relations or concepts
12. Debugging
    Provide corrective action
13. Example-based reasoning
    The source of most rules

Figure 7. AI function capabilities.

In general, the more sophisticated tools
have been written in Lisp. Even
these tools are now being rewritten in languages
such as C to increase speed, reduce
memory requirements, and promote
availability on a larger variety of computers.
However, some new approaches to
mechanizing Lisp may reduce the speed
and memory advantages associated with
C.

The user can usually extend tools writ¬ 
ten in Lisp by writing additional Lisp func¬ 
tions. This is also true of some of the other 
languages, for example, Prolog and Pas¬ 
cal. Similar extensibility is usually found 
in tools having language hooks for access¬ 
ing other programs or database hooks for 
accessing other information. In some 
cases, the expert system generated by the 
tool is fully embeddable in other systems, 
which produces increased autonomy. 
Whether or not a system is fully embed¬ 
dable in other systems and is therefore 
capable of autonomous operations is 
becoming increasingly important, now 
that expert systems are moving from pro¬ 
totypes to being fielded. Reliability and 
memory management (in Lisp, the latter 
takes the form of garbage collection) are 
often important considerations for fielded 
systems. “Garbage collection” is the col¬ 
lection of no-longer-used memory alloca¬ 
tions; these allocations can slow the system 
operation. 

The computers supported by the various 
tools are primarily a function of the lan¬ 
guage and operating system in which the 
tools are written, and the computer’s mem¬ 
ory, processing, and graphics-display capa¬ 
bilities. The trend toward making expert 
system shells available on personal com¬ 
puters (such as those made by IBM) results 
in part from the increasing capabilities of 
these computers. However, this trend is also 
partly owing to the writing of tools in faster 
languages, such as C, and to taking advan¬ 
tage of modularization in building the 
knowledge base. As mentioned earlier, 
such modularization involves decompos¬ 
ing the problem into subproblem modules 
and providing appropriate linking between 
these modules as required during 
operation. 


Function capabilities 

Of primary consideration are the func¬ 
tion applications that can readily be built 
with a particular ESBT. A review of the 
major function applications follows (see 
also Figure 7). 

Classification. By far the most common 
function addressed by expert systems is 
classification. “Classification” refers to 
selecting an answer from a fixed set of 
alternatives on the basis of information 
that has been input. 

Below are some subcategories of classi¬ 
fication. 

• Interpretation of measurements. This 
refers to hypothesis selection performed on 
the basis of measurement data and corol¬ 
lary information. 

• Diagnosis. In diagnosis, the system not
only interprets data to determine the difficulty,
but also seeks additional data when
such data is required to aid its line of 
reasoning. 

• Debugging, treatment, or repair. These 
functions refer to taking actions or recom¬ 
mending measures to correct an adverse 
situation that has been diagnosed. 


• Use advisor. An expert system as a front 
end to a computer program or to a piece of 
machinery can be very helpful to the inex¬ 
perienced user. Such systems depend both 
on the goals of the user and the current sit¬ 
uation in suggesting what to do next. Thus, 
the advice evolves as the state of the world 
changes. Use advisors can also be helpful 
in guiding users through procedures in 
other domains (for example, auto repair 
and piloting aircraft). 

Classification and other function applications
can be considered to be of two
types: surface reasoning and deep reasoning.
In surface reasoning, no model of the system
is employed; the approach taken is to
write a collection of rules, each rule assert¬ 
ing that a certain situation warrants a cer¬ 
tain response or conclusion. (These 
situation-response relationships are 
usually written as heuristic rules garnered 
from experience.) In deep reasoning, the 
system draws upon causal or structural 
models of the domain of interest to help 
arrive at the conclusion. Thus, systems 
employing deep models are potentially 
more capable and may degrade more 
gracefully than those relying on surface
reasoning.

Table 1. A subjective view of the importance of various expert system tool attributes for
particular function applications. (Only fragments of the column headings survive:
inference approach, forward reasoning, hypothetical reasoning, blackboard, object
description.)

Design and synthesis. “Design and syn¬ 
thesis” refers to configuring a system on 
the basis of a set of alternative possibilities. 
The expert system incorporates constraints 
that the system must meet as well as gui¬ 
dance for steps the system must take to 
meet the user’s objectives. 

Intelligent assistant. Here the emphasis is 
on having a system that, depending on user 
needs, can give advice, furnish informa¬ 
tion, or perform various subtasks. 

Prediction. “Prediction” refers to fore¬ 
casting what will happen in the future on 
the basis of current information. This fore¬ 
casting may depend upon experience alone, 
or it may involve the use of models and for¬ 
mulas. The more dynamic systems may use 
simulation to aid in the forecasting. 

Scheduling. “Scheduling” refers to time¬ 
ordering a given set of tasks so that they 
can be done with the resources available 
and without interfering with each other. 

Planning. “Planning” is the selection of 
a series of actions from a complex set of 
alternatives to meet a user’s goals. It is 
more complex than scheduling in that tasks 
are chosen, not given. In many cases, time
and resource constraints do not permit all
goals to be met. In these cases, the most 
desirable outcome is sought. 

Monitoring. “Monitoring” refers to 
observing an ongoing situation for its 
predicted or intended progress and alerting 
the user or system if there is a departure 
from the expected or usual. Typical appli¬ 
cations are space flights, industrial 
processes, patients’ conditions, and enemy 
actions. 

Control. Control is a combination of 
monitoring a system and taking appropri¬ 
ate actions in response to the monitoring 
to achieve goals. In many cases, such as the 
operation of vehicles or machines, the 
tolerable response delay may be as small as 
milliseconds. In such a case, the system 
may be referred to as a real-time system. 
“Real-time” is defined as “responding 
within the permissible delay time” to the 
end that the system being controlled stays 
within its operating boundaries. 

Digest of information. A system perform¬ 
ing this function may take in information 
and return a new organization or synthe¬ 
sis. One application may be the inductive 
determination of a decision tree from
examples. Others may be the assessment of
military or stock market situations on the 
basis of input data and corollary infor¬ 
mation. 

Discovery. Discovery is similar to digest 
of information except that the emphasis is 
on finding new relations, order, or con¬ 
cepts. This is still a research area. Examples 
include finding new mathematical con¬ 
cepts and elementary laws of physics. 

Others. There are other functions, such 
as learning, that are directly subsumable 
under the ones I have enumerated thus far. 
In many cases, these functions (and some 
of those already mentioned) can be ingeni¬ 
ously decomposed into functions discussed 
previously. Thus, for example, design and 
some other functions can often be sepa¬ 
rated into subtasks that can be solved by 
classification. 


Importance of various 
ESBT attributes for 
particular function 
applications 

Table 1* is an attempt to relate the various
attributes that are found in different
ESBTs to their importance in facilitating
the building of expert systems that perform
different functions. A solid circle
indicates an attribute that is very worthwhile
in helping to build that function. An
open circle indicates that it is a lesser contributor.
An empty cell indicates an attribute
that does not provide a significant
contribution. As indicated earlier, the 
evaluation is subjective because, depend¬ 
ing on the insight and ingenuity of the sys¬ 
tem developer, some of the functions can 
be decomposed into other functions. 
Thus, Table 1 reflects what I see as 
obvious and perhaps necessary attributes 
for straightforward construction of expert 
systems that perform the indicated 
functions. 


*In the future, various ESBT approaches may be
shown to be Turing Machine equivalents, which 
would mean that any computation could be per¬ 
formed by them. Therefore, it usually cannot be 
said definitively that ESBT x cannot perform 
function y. Thus, Table 1 in the sidebar is really 
an attempt to reflect my perception of which 
ESBT attributes simplify the programming of 
various expert-system functions.

Brief descriptions of commercial ESBTs


The following are descriptions of some of the current com¬ 
mercial expert system building tools in common use. The 
attributes of these tools are summarized in Tables 1 and 2 for 
easy comparison. This sidebar is not intended to be an 
exhaustive survey. For example, VP-Expert, an inexpensive 
(under $100) but capable rule-based ESBT for PCs, has 
recently been introduced by Paperback Software in Berkeley, 
Calif. GEST, an evolving university-supported ESBT from Geor¬ 
gia Tech, provides high-order capabilities (such as multiple 
knowledge representations) at a fraction of the cost of com¬ 
mercial, more polished tools offering similar capabilities. 
GURU, from mdbs in Lafayette, Ind., a composite ESBT inte¬ 
grated with a database spreadsheet and natural-language 
front end, is also available. 


ART 

ART is a versatile tool that incorporates a sophisticated pro¬ 
gramming workbench. It runs on advanced computers and 
workstations such as those produced by Symbolics, LMI, TI,
Apollo, and VAX. ART’s strong point is viewpoints, a technique 
that allows hypothetical nonmonotonic reasoning; in non¬ 
monotonic reasoning, multiple solutions are carried along in 
parallel until constraints are violated or better solutions are 
found. At such points, inappropriate solutions are discarded. 
ART provides graphics-based interfaces for browsing both its 
viewpoint and schema (frame) networks. ART is primarily a 
forward-chaining system with sophisticated user-defined pat¬ 
tern matching; the pattern matching is based on an enhanced 
version of an indexing scheme derived from OPS5. (OPS5 is 
discussed below.) Object-oriented programming is made avail¬ 
able by attaching procedures (active values) to objects (the 
objects are called schemata). ART has a flexible graphics 
workbench with which to create graphical interfaces and 
graphical simulations. ART was designed for near-real-time 
performance. To achieve this performance, ART compiles its 
frame-based as well as its relational knowledge into logic-like 
assertions (the latter are called discrimination networks). 
Applications particularly suited for ART are planning/schedul¬ 
ing, simulation, configuration generation, and design. Cur¬ 
rently written in Lisp, ART employs a very efficient, unique 
memory management system that virtually eliminates garbage
collection. A C-language version is now available. (Further
information on ART is available from Inference Corp.,
5300 W. Century Blvd., Los Angeles, CA 90045; (213) 417-7997.)

KEE 

KEE, which runs on advanced AI computers, is the most
widely used programming environment for building sophisti¬ 
cated expert systems. Important aspects of KEE are its mul¬ 
tifeature development environment and end-user interfaces, 
which incorporate windows, menus, and graphics. KEE con¬ 
tains a sophisticated frame system that allows the hierarchi¬ 
cal modeling of objects and permits multiple forms of 
inheritance. KEE also offers a variety of reasoning and analy¬ 
sis methods, including object-oriented programming, forward 
and backward chaining of rules, hypothetical reasoning 
(which is incorporated as KEE Worlds), a predicate-logic lan¬ 
guage, and demons. It has an open architecture that supports 
user-defined inference methods, inheritance roles, logic oper¬ 
ators, functions, and graphics. KEE has a large array of 
graphics-based interfaces that are developer/user controlled,
including facilities for graphics-based simulation (the
graphics-based simulation facility, SimKit, is available at
extra cost). KEE has been used for applications in diagnosis, 
monitoring, real-time process control, planning, design, and 
simulation. (Further information on KEE is available from 
IntelliCorp, 1975 El Camino Real West, Mountain View, CA 
94040; (415) 965-5500.) 

Knowledge Craft 

Knowledge Craft (KC) is a hybrid tool based on frames that 
have user-defined inheritance. It is an integration of the Car¬ 
negie Mellon version of OPS5 and of Prolog and the SRL 
frame-representation language. It is a high-productivity tool kit 
for experienced knowledge engineers and AI system builders.
Frames are used for declarative knowledge; procedural knowl¬ 
edge is implemented by the attaching of demons. KC is capa¬ 
ble of hypothetical (nonmonotonic) reasoning when Contexts 
(a facility offering alternative worlds) is employed. Search is 
user defined. A graphics-based simulation package (Simula¬ 
tion Craft) is available. Designed to be a real-time system, KC 
is particularly appropriate for planning/scheduling and, to an 
extent, is appropriate for process control, but it is something 
of an overkill where simple classification problems are con¬ 
cerned. (Further information on Knowledge Craft is available 
from Carnegie Group, Inc., 650 Commerce Court, Station 
Square, Pittsburgh, PA 15219; (412) 642-6900.) 

Picon 

Picon is designed as an object-oriented expert system shell 
for developing real-time, on-line expert systems for industrial 
automation and other processes that are monitored with sen¬ 
sors during real time; such processes are found in some 
aerospace and financial applications. Picon operates on the 
LMI Lambda/Plus Lisp machine and the TI Explorer, which
combine the intelligent processing power of a Lisp processor 
with the high-speed numeric processing and data-acquisition 
capabilities of an MC68010 processor. The two processors 
operate simultaneously, enabling Picon to monitor the system 
in real time, detect events of possible significance in process, 
diagnose problems, and decide on an appropriate course of 
action. Picon’s icon editor and graphics-oriented display enable
a developer with minimal AI training to construct and represent
a deep model of the process being automated. Rules
about the process are entered by means of a menu-based 
natural-language interface. Picon supports both forward and 
backward chaining. (Further information on Picon is available 
from Lisp Machine, Inc., 6 Tech Dr., Andover, MA 01810; (617) 
669-3554.) 

S.1

S.1 is a powerful commercial ESBT aimed at structured clas¬ 
sification problems. Facts are expressed in a frame represen¬ 
tation; judgment-type knowledge is expressed as rules. 

Though ostensibly a backward-chaining system, S.1 performs
forward reasoning by means of a patented procedural control 
block technique. Control blocks can be viewed as implementa¬ 
tions of flow diagrams; they guide the system procedure by 
telling the system the next step to take in the current situa¬ 
tion. Control blocks can invoke other control blocks or rules, 
or they can initiate interactive dialogue. Control blocks are a 
powerful, knowledge-based means of controlling the search, 


May 1987 





Table 1. Attributes of some commercial ESBTs. (This table is continued on page 34.) 
and thus they have made it possible for one to write programs
containing thousands of rules without being overwhelmed by 
a combinatorial explosion (runtimes tend to be linear with the 
number of rules). S.1 is written in C and executes very rapidly. 
A major advantage of S.1 is that it can readily be integrated 
into existing software. A delivery version is available without 
the system-development portion of S.1; the delivery version 
can be completely embedded in applications. S.1 is not aimed 
at exploratory programming; it is aimed at commercial applications
in which iterative development of solutions to solvable
problems is desired. S.1 has an excellent user interface
that features mouse-driven, graphical representations of both 
the knowledge bases and the inference traces. Problems can 
be solved in terms of subproblems, which can be linked to 
handle the complete problem (consistency checking is per¬ 
formed as part of linkage). All S.1 features are expressed in an 
integrated, strongly typed, block-structured language that 
facilitates system development and long-term maintenance. 
(Further information on S.1 is available from Teknowledge, Inc., 
1850 Embarcadero Rd., PO Box 10119, Palo Alto, CA 94303;
(415) 424-0500.)



ES Environment/VM or MVS
(ESE/VM or ESE/MVS) 

ESE is an improved version of Emycin; it is designed for 
classification problems, but does allow for forward chaining. 

It consists of two components: a development interface and a 
consultation interface. A Focus Control Block mechanism has 
been added to allow the developer to modify and control the 
flow of inference and, thus, to increase the system speed. 
ESE/VM and ESE/MVS have good utilities for enabling the
developer to fashion the user interface and to incorporate
graphics in the user interface when appropriate. ESE is partic¬ 
ularly suitable for IBM mainframe users who must interface 
with existing software and databases. (Further information on 
ESE is available from IBM, Dept. M52, 2800 Sand Hill Rd., 
Menlo Park, CA 94025; (415) 858-3000.) 

Envisage 

Envisage is a Prolog-derived tool. Thus, instead of entering 
rules, one enters logical assertions. Non-Prolog features 
include demons, fuzzy logic, and Bayesian probabilities. 
Envisage is primarily aimed at classification problems. (Fur¬ 
ther information on Envisage is available from System 
Designers Software, Inc., 444 Washington St., Suite 407, 
Woburn, MA 01801; (617) 935-8009.) 

KES 

KES is a three-paradigm system that supports production 
rules, hypothesize-and-test rules (hypothesize-and-test rules 
use the criterion of minimum set coverage to account for 
data), and Bayesian-type rules for domains in which knowl¬ 
edge can be represented probabilistically. KES is primarily 
geared to classification-type problems. KES can be embedded 
in other systems. The hypothesize-and-test approach starts 
with a knowledge base of diagnostic conclusions (that is, 
classifications) with their accompanying symptoms (also 
called “characteristics”). The session begins with the selec¬ 
tion by the system of the set of all diagnoses that match the 
first symptom of the given problem; the system then reduces 
this set as the remaining problem symptoms are considered. 

If the initial set of diagnoses does not include all the remain¬ 
ing symptoms, new diagnoses are added to the set to cover 
these cases. (Further information on KES is available from 
Software Architecture and Engineering, Inc., 1600 Wilson 
Blvd., Suite 500, Arlington, VA 22209-2403; (703) 276-7910.) 
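
The set-covering idea behind hypothesize-and-test can be illustrated with a short sketch. This is not KES code: it is a greedy approximation to minimum set coverage, and the function name `greedy_cover` and the toy knowledge base `KB` are assumptions made purely for illustration.

```python
def greedy_cover(knowledge_base, observed):
    """Pick diagnoses until every observed symptom is explained.

    Greedy set cover: an approximation to minimum set coverage,
    shown for illustration only (not the algorithm KES itself uses).
    """
    uncovered = set(observed)
    chosen = []
    while uncovered:
        # Diagnosis explaining the most still-unexplained symptoms;
        # iterating in sorted order makes tie-breaking deterministic.
        best = max(sorted(knowledge_base),
                   key=lambda d: len(knowledge_base[d] & uncovered))
        gained = knowledge_base[best] & uncovered
        if not gained:
            break  # remaining symptoms match no known diagnosis
        chosen.append(best)
        uncovered -= gained
    return chosen, uncovered


# Hypothetical knowledge base: diagnosis -> set of symptoms.
KB = {
    "flu": {"fever", "cough", "aches"},
    "cold": {"cough", "sneezing"},
    "allergy": {"sneezing", "itchy eyes"},
}

diagnoses, unexplained = greedy_cover(KB, {"fever", "cough", "sneezing"})
```

Symptoms that no diagnosis accounts for are returned in the second value rather than silently dropped, which mirrors the need to add new diagnoses when the initial set does not cover every symptom.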


M.1

M.1 is a PC-based ESBT targeted for solvable problems
rather than for exploratory programming. It is basically a
backward-chaining system designed for classification. It 
includes the capability for meta-level commands that direct 
forward reasoning. Written in C, it can readily be integrated 
into existing conventional software. Its main drawback is that 
it has no true object-description capability and therefore can¬ 
not readily support deep systems. However, M.1 does have a 
good set of development tools and developer- and user- 
friendly interfaces. (Further information on M.1 is available 
from Teknowledge, Inc., 1850 Embarcadero Rd., PO Box 10119, 
Palo Alto, CA 94303; (415) 424-0500.) 

Nexpert Object 

Nexpert Object is a powerful, rule-based tool coded in C to 
run on a Macintosh with 512K of RAM, the Mac Plus, or the 
IBM PC AT. It has editing facilities comparable with those 
found on a large tool designed to run on the more sophisticated
AI machines. The system allows the developer to group
rules into categories so that the rules need be called up only 
when they are appropriate. Nexpert Object supports variable 
rules and combinations of forward and backward chaining. 
The system can automatically generate graphical representa¬ 
tions of networks of rules; these representations of networks 
indicate how the rules relate to one another. Similar networks 
can be generated to show rule firings that take place in 


May 1987 


37 







































response to a particular consultation. Nexpert Object 
includes the capabilities of both frame representations that 
have multiple inheritance and of pattern-matching rules, so 
deep reasoning is facilitated. Nexpert Object is a sophisti¬ 
cated system with a focus on the graphical representation of 
both the knowledge bases and the reasoning process, which 
makes possible natural and comprehensible interfaces for 
both the developer and end user. (Further information on Nex¬ 
pert Object is available from Neuron Data Corp., 444 High St., 
Palo Alto, CA 94301; (415) 321-4488.) 

Personal Consultant + (PC+) 

PC+ is an attempt to provide on a personal computer many 
of the advanced features found in more sophisticated tools; 
such tools include KEE. Thus, PC+ utilizes frames with attrib¬ 
ute inheritance, and rules. PC+ supports the backward¬ 
chaining approach derived from Emycin. It also includes 
forward-chaining capabilities without variable bindings. PC+
has an extensive set of tools for both development and execu¬ 
tion that incorporate user-friendly interfaces. The new 2.0 ver¬ 
sion supports up to 2M bytes of expanded or extended 
memory for increased knowledge-base capacity. It also sup¬ 
ports the IBM Enhanced Graphics Adapter and access to the 
popular dBase II and III database packages on the IBM PC. A 
version of PC+ is also available for the TI Explorer Lisp
Machine. PC Easy, a simplified version of PC+ without 
frames, is also offered. (Further information on PC+ is availa¬ 
ble from Texas Instruments, Inc., PO Box 209, MS 2151, Austin, 
TX 78769; (800) 527-3500.) 

Exsys 3.0 

Exsys 3.0 is written in C for PCs as an inexpensive, rule- 
based, backward-chaining system oriented toward 
classification-type problems. Rules are of the if-then-else 
type. Exsys includes a runtime module and a report generator. 
Exsys can interface to the California Intelligence company’s 
after-market products: Frame (to provide frame-based knowl¬ 
edge representation) and Tablet (to provide a blackboard 
knowledge-sharing facility that incorporates tables). (Further 
information on Exsys 3.0 is available from Exsys, Inc., PO Box 
75158, Contr. Sta. 14, Albuquerque, NM 87194; (505) 836-6676.) 

Expert Edge 

Expert Edge is basically a rule-based, backward-chaining 
system aimed at rapidly prototyping and delivering classifica¬ 
tion applications in the 50-to-500 rule range. It uses probabili¬ 
ties and Bayesian statistics to handle uncertainties and lack 
of information. Its outstanding features are its excellent 
developer and end-user interfaces, which feature pop-up win¬ 
dowing environments. These are accompanied by a natural- 
language interface and very good debugging facilities. The 
professional version interfaces with a video disk and is also 
able to do extended mathematical calculations. (Further infor¬ 
mation on Expert Edge is available from Human Edge Soft¬ 
ware Corp., 1875 S. Grant St., San Mateo, CA 94402-2669; (415) 
573-1593.) 


ESP Advisor and ESP Frame-Engine 

ESP is a Prolog-based system that is particularly appropri¬ 
ate for designing expert systems that guide an end user in per¬ 
forming a detailed operation involving technical skill and 
knowledge. The developer builds the system by programming 
in KRL (Knowledge Representation Language), a sophisti¬ 
cated and versatile language that supports numeric and string 
variables, including facts, numbers, categories, and phrases. 
Prolog’s heritage is clearly apparent in the system’s ability to 
support a full set of logic operators, which enables the 
developer to write efficient, complex rules. The ESP consulta¬ 
tion shell offers a well-designed, multipanel display that 
makes good use of color. A text-animation feature allows the 
developer to insert text at any point in a consultation. Though 
ESP Advisor was designed as an introductory prototype tool, 
its extensibility makes expert systems of greater complexity 
possible. ESP Frame-Engine supports frames with inheritance, 
forward and backward chaining rules, and demons. (Further 
information on the ESP products is available from Expert Sys¬ 
tems International, 1700 Walnut St., Philadelphia, PA 19103; 
(215) 735-8510.) 

Insight 2+

Insight 2+ is primarily a rule-based, backward-chaining 
(that is, goal-driven) system, but it can support forward chain¬ 
ing as well. Facts are represented as elementary objects with 
single-value or multivalue attributes. Rules are entered in PRL 
(Production Rule Language). The knowledge base is compiled 
prior to runtime. Uncertainty is handled by means of confi¬ 
dence factors and thresholds. Because Insight 2+ lacks 
methods for representing deep models, it is best used for heu¬ 
ristic problems, for which it is a useful tool. Its ability to 
access external programs and databases is a major enhance¬ 
ment. (Further information on Insight 2+ is available from 
Level Five Research, Inc., 503 Fifth Ave., Indialantic, FL 32903; 
(305) 729-9046.) 

TIMM 

TIMM is an inductive system that builds rules from exam¬ 
ples. Examples are first translated into rules, which are then 
used to build more powerful generalized rules. TIMM handles 
contradictory examples by arriving at a certainty that is based 
on averaging these examples’ conclusions. Partial-match analogical
inferencing is used to deal with incomplete or nonmatching
data. TIMM indicates the reliability of its results.
The expert systems that result from it can be embedded in
other software programs. (Further information on TIMM is 
available from General Research Corp., 7655 Old Springhouse 
Rd., McLean, VA 22102; (703) 893-5900.)

Rulemaster 3.0 

Though Rulemaster is capable of independent forward and 
backward chaining, its major distinguishing feature is its 
capability for inductively generating rules from examples. It 
also offers fuzzy logic. Interaction with the knowledge base is 
accomplished by means of a text editor. If they prefer, knowl¬ 
edge engineers can develop Rulemaster applications by writ¬ 
ing code directly in the high-level Radial language of Rulemaster 
instead of using examples. However, a strong programming 
background is required for easy usage. Rulemaster can generate
C or Fortran source code for fast execution, compactness,
and for creation of portable expert systems that can interface 
to other computer programs. (Further information on 
Rulemaster is available from Radian Corp., 8501 Mo-Pac Blvd., 
PO Box 9948, Austin, TX 78766; (512) 454-4797.) 

KDS3 

KDS3 inductively generates rules from examples. Examples 
can be grouped to develop knowledge modules, which KDS 
calls frames and which can be chained together to form very 
large systems. Both forward and backward chaining are sup¬ 
ported. KDS3 can take input from external programs and sen¬ 
sors and can drive external programs. Expert systems built 
with KDS3 can be made either (a) interactive or (b) fully auto¬ 
matic for intelligent process control. The entire system is writ¬ 
ten in assembly language for very rapid execution on PCs. 
Graphics can be incorporated automatically from picture files 
or, if one makes use of built-in KDS3 color graphics primitives, 
they can be drawn in real time. KDS3 incorporates a black¬ 
board by means of which knowledge modules can communi¬ 
cate. KDS2 without the blackboard facility is also available. 
(Further information on KDS3 is available from KDS Corp., 934 
Hunter Rd., Wilmette, IL 60091; (312) 251-2621.) 

Attributes of particular commercial ESBTs

The sidebar entitled “Brief descriptions
of commercial ESBTs” presents some of
the better-known commercial ESBTs.
Attributes of these ESBTs are listed in
Table 1 of that sidebar. Inclusion of an
ESBT in the sidebar in no way represents
an endorsement of that product. The
descriptions and listings have been constructed
from company and noncompany
literature, discussions with company
representatives, demonstrations, exploratory
use of the tools, and so on. However,
some incompleteness, errors, and oversights
are inevitable in such an endeavor,
so it behooves the interested person to use
this material as a guide and to examine the
systems directly. Direct examination is
particularly important because increasing
competition is forcing ESBT developers to
make rapid improvements and changes in
both their systems and their prices.

1st-Class

This is an induction system that generates decision trees,
which are elaborate rules, from examples given in spreadsheet 
form. Problems can be broken down into modules derived 
from sets of examples; the modules can be chained together 
with forward or backward chaining. Rules can also be 
individually built or edited in graphical form on the screen. 
Several algorithms are available for inferencing: The system 
can match queries to examples that exist in the database, or 
the system can utilize the rule trees either as generated or in 
the preferred mode, which employs optimized rule trees that 
ask questions in the best order. Because all the rules are compiled,
the system is very fast. The 1st-Class induction system
is designed to interface readily with other software. (Further
information on 1st-Class is available from Programs in Motion,
Inc., 10 Sycamore Rd., Wayland, MA 01778; (617) 653-5093.) 

OPS5 

Various versions of the OPS5 expert-system-development 
language, developed at Carnegie Mellon University, are availa¬ 
ble. OPS5 is a forward-chaining, production-rule tool with 
which many famous expert systems used at DEC, such as 
R1/XCON, have been built. OPS5 pattern-matching language 
permits variable bindings. However, OPS5 does not have facili¬ 
ties for sophisticated object representations. In general, the 
development environment is unsophisticated, although some 
debugging-and-tracing capability is usually provided. The use 
of a sophisticated indexing scheme (the Rete algorithm) for 
finding rules that match the current database makes OPS5 
one of the tools that executes the fastest. Unfortunately, it is 
not an easy tool for the nonprogrammer to use. Variations of a 
representative version, OPS5+, can be obtained for the IBM 
PC, Macintosh, and the Apollo Workstation. (Further informa¬ 
tion on OPS5 is available from ComputerThought Corp., 1721 
West Plano Pkwy., Suite 125, Plano, TX 75075; (214) 424-3511.) 
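
For readers unfamiliar with forward chaining, the following minimal sketch shows the basic match-and-fire cycle. It is written in Python rather than OPS5, rescans every rule on each pass instead of using the Rete algorithm, and does not support variable bindings; the function `forward_chain` and the sample rules are hypothetical.

```python
def forward_chain(facts, rules):
    """Fire rules until no rule adds a new fact.

    Naive illustration: every rule is rechecked on every pass,
    which is exactly the cost that Rete-style indexing avoids.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires when all of its conditions are already known.
            if conclusion not in known and conditions <= known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical configuration rules: (set of conditions, conclusion).
RULES = [
    (frozenset({"order has cpu", "order has disk"}), "needs disk controller"),
    (frozenset({"needs disk controller"}), "needs backplane slot"),
]

result = forward_chain({"order has cpu", "order has disk"}, RULES)
```

The second rule fires only because the first one added its conclusion to the fact set, which is the data-driven behavior that distinguishes forward chaining from goal-driven backward chaining.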


A comparative, composite view of the various tools

Table 2 of the sidebar provides a com¬ 
posite view of the various ESBTs. Many of 
the attributes have been integrated to pro¬ 
vide a more easily understandable picture 
of the capability of the tools in each sub¬ 
category (for example, the rule and proce¬ 
dure attributes have been combined into 
“representation of actions”). A solid cir¬ 
cle indicates that the tool appears to be 
strong in a subcategory, an open circle 
indicates that it appears to be fair, and an 
empty cell indicates little or no capability 
in that area. Note that by relating each 
tool’s attributes to its functional impor¬ 
tance, I have attempted to indicate each 
tool’s suitability for developing various 
function applications. Also, note that the 
more expensive (and correspondingly 
more sophisticated) tools have the widest 
applicability. This is often because they are 
a collection of different paradigms incor¬ 
porated into a single tool. As a result, they 
may often be regarded as higher order pro¬ 
gramming languages and environments, 
instead of as simple shells into which infor¬ 
mation is inserted to create an expert sys¬ 
tem directly. The shell model is more 
nearly true of the simpler induction sys¬ 
tems; such systems can be considered as 
knowledge-acquisition and rapid-prototyping
tools from which more complex systems can be built by means of other
tools by enlarging upon the simple rules
inductively generated.

Figure 8. Considerations for assessing the overall usability of a tool.

Overall usability of a tool

Figure 8 summarizes some of the aspects 
that enter into the critical ESBT attribute 
“overall usability of a particular tool.” In
addition to obvious factors such as costs 
and function applicability (function 
applicability is a measure of which func¬ 
tions are easily accomplished with a tool 
and which are difficult to accomplish with 
it), tool choices should be guided by the 
size of the system to be built, how rapidly 
a system of the given size and complexity 
can be built with the tool, and the speed of 
operation of the tool both during develop¬ 
ment and, particularly, during end use. 
(During end use, sub-elements of the tool 
act as a software delivery vehicle for the 
developed expert system.) Perhaps the 
most important factor, however, is the 
degree of satisfaction of both the 


developer and the end user. This is related 
to how obvious the uses of the tool features 
are, how direct the lines of action to the 
user’s or developer’s goals are, the control 
the developer and end user sense that they 
have over the system, the nature of the 
interaction or display (for example, 
whether they take place by means of menu 
or graphics), how easy it is to recover from 
errors, the on-line help that is furnished, 
and the perceived esthetics, reasonable¬ 
ness, and transparency of the system. Also 
of major importance is how easy it is to 
learn the system. This often depends on 
many of the factors already discussed, but 
is also closely related to how apparent the 
choice is at each step (for example, the 
apparent choice when menus are used is 
different from the apparent choice when 
programming is required), the quality of 
the documentation and on-line help, and 
the ESBT’s structure. Manufacturer- 
sponsored courses help; however, these are 
often expensive and inconvenient. A 
related factor is manufacturer support of 
the tool, particularly the availability of 
help over the telephone when it is required. 


Finally, such factors as the system’s 
portability, the computers it will run on, 
the delivery environment, the system’s 
capability of interfacing with other pro¬ 
grams and databases, and whether the 
developed system can be readily embedded
in a larger system are all important in an 
evaluation of a tool. A more difficult fac¬ 
tor to evaluate is the ease of prototyping 
versus life cycle cost. As prototypes are 
expanded into fielded systems and as they 
are iteratively further expanded and 
updated, difficulties are often encountered 
in system stability, runtime, and memory 
management. 

Though many of these factors can be 
deduced from the tool’s specifications and 
from system demonstrations, in many 
cases one can properly differentiate 
between two tools intended for the same 
application only if he or she learns both 
systems and attempts to build the same set 
of applications with each one. Neverthe¬ 
less, the factors described in this article and 
the initial evaluation furnished in the sidebar
should prove useful as initial guides to
potential users. 

To date, ESBTs have made possible
productivity improvements of
an order of magnitude or more in 
constructing expert systems. Current tools 
are only forerunners of ESBTs yet to 
come. The trend is toward less expensive, 
faster, more versatile, and more portable 
tools that will readily make possible devel¬ 
opment of expert systems that can directly 
communicate with existing conventional 
software such as databases and spread¬ 
sheets, and can also be embedded into 
larger systems, with resulting autonomous 
operations. Higher-end ESBTs are now 
moving from Lisp machines to more con¬ 
ventional workstations that are less expen¬ 
sive. Lower-end systems are becoming 
more capable and now appear on IBM 
PCs and Macintoshes. Delivery systems, 
which utilize a subset of the complete 
ESBTs (ESBTs with the development por¬ 
tion removed) are now allowing the com¬ 
pleted expert system to be delivered on 
personal computers or workstations. In 
addition, versatility will be enhanced with 
increased choices of inference engines such 
as blackboard systems. Also in the works 
are modular ESBTs that will allow the 
developer to choose various knowledge 
representations and inference techniques 
as he or she desires and still be able to build 
an integrated system. Already appearing 
are ESBTs coupled to other software systems
such as databases and spreadsheets.
Also beginning to appear are expert sys¬ 
tems that are specialized to specific func¬ 
tions such as scheduling, process control, 
and diagnosis. 

Finally, the developer and end-user 
interfaces are getting friendlier and more 
capable. One of the things providing 
greater capability is the increased use of 
graphics and graphical simulations. It is 
expected that as these friendlier systems 
emerge, there will be increased develop¬ 
ment of expert systems directly by the 
experts themselves. 

The rich and growing variety of ESBTs 
may make it more difficult to choose a 
tool, but if it is properly selected, the tool 
will be more closely matched to developer 
and end-user needs. □ 
In the last few years we have seen an
explosion in the interest in and avail¬ 
ability of parallel processors 1 and a 
corresponding expansion in applications 
programming activity. Clearly, applica¬ 
tions programmers need tools to express 
the parallelism, either in the form of sub¬ 
routine libraries or language extensions. 
There has been no shortage of providers of 
such tools. 2-7 These tools allow the applications
programmer to express the parallelism
explicitly. There are also projects
underway to provide automatic parallelization
of sequential code. 8-10

While this article presents language 
extensions in use at the end of 1985, it is 
not comprehensive. I will concentrate on 
the following aspects of the problem: 

• Scientific programming: Character¬ 
ized by a high degree of floating-point 
computation, usually coded in Fortran. 
The discussion is not relevant to logic pro¬ 
gramming such as AI applications 
expressed in Lisp or Prolog. 

• General-purpose machines: Machines 
capable of running a wide variety of appli¬ 
cations with reasonable efficiency. In 
special-purpose machines such as image, 
signal, and array processors, and systolic 
arrays the algorithms are coded into the 
hardware. Even general-purpose array 
processors rarely exhibit the parallelism of 
interest here. 

• MIMD processors: Multiple- 
instruction, multiple-data machines, not 
vector processors, which are single¬ 
instruction, multiple-data (SIMD) 
machines. I treat the vector unit, if available,
in much the way I treat the floating-point
multiplier, as another functional
unit. I will not discuss parallel SIMD
machines here, like the Illiac IV, 11 ICL 
DAP, 12 MPP, 13 and Connection 
Machine. 14 Nor will I discuss microin¬ 
struction machines. For example, each 
processor in a systolic array 15 is not a 
complete computer in the sense that an 
entire application cannot be run on it. 

• Moderately parallel systems: No 
more than tens of processors. The tools 
described leave it up to the programmer to 
distribute the work, even to the extent of 
having a different program running on 
each processor. Massively parallel systems 
of the order of a thousand or more proces¬ 
sors are quite different. At this time there 
are no general-purpose, MIMD machines 
in this class that are widely available, so no 
one has experience programming them. 

• Fortran: The most commonly used 
language for scientific computers. Since 
most of the work to date has been done in 
Fortran, and because most scientific 
programmers are familiar with it, I use 
Fortran for all the examples. It is possible 
to use these constructs in other languages, 
but “modern” languages like Pascal and 
Ada that attempt to eliminate side effects 
are quite different. Side effects represented 
by Fortran COMMON data present 
important advantages and problems for 
the application programmer. 

• Explicitly declared parallelism: Par¬ 
allelism controlled by the programmer. I 
describe a style in which the programmer 
controls the parallelism. Even though 
great strides have been made in automatic 
parallelization, the only automatic system 
available to date 10 is limited to individual
loops. Parallelism at a larger granularity
must still be specified by the programmer. 

Rather than enumerate all possible ways 
of describing a particular type of parallel¬ 
ism, I use generic notation. This notation 
indicates how the parallelism is described 
without worrying about details of imple¬ 
mentation. For example, I will not discuss 
locks, semaphores, and monitors that can 
be used to implement the constructs I 
describe. Actual implementations are, and 
must be, somewhat more complicated 
than those described here. 


Taxonomy 

A common myth is that the programmer 
does not need to understand the hardware 
being used. Like all myths, it contains a 
grain of truth. However, when perform¬ 
ance becomes critical, programmers have 
used their knowledge of paging, cache size, 
vector lengths, and so forth to tune their 
programs. 

The situation is worse for parallel 
processors than for uniprocessors because 
of the wider variety of architectures. 
Although a number of efforts are being 
made to write portable programs for par¬ 
allel processors, 7,16 some algorithms will 
run poorly on certain architectures. In 


addition, the coding style used will fre¬ 
quently depend on the type of parallelism 
in the hardware. 

Rather than attempt a complete tax¬ 
onomy, I restrict the classification to those 
aspects that affect coding style. I put all 
systems into one of three classes: shared 
memory, message passing, or hybrid. 
While I don’t expect such a simple scheme 
to describe the variety of machines possi¬ 
ble, it is sufficient to demonstrate the vari¬ 
ety of programming styles used. 

It is common to distinguish between 
processes and processors. A process is a 
running program; a processor is a com¬ 
puter. In many operating systems it is pos¬ 
sible to have many processes running on 
one processor. For the remainder of this 
article, I ignore this distinction and assume 
each processor has only one process. Such 
an assumption is reasonable for scientific 
codes running on a stand-alone parallel 
processor. 

Shared memory. A shared memory 
machine has a single global memory acces¬ 
sible to all processors. The simplest config¬ 
uration is shown in Figure 1. Each 
processor may have some local memory, 
such as registers as on the Cray X-MP, 17 or 
the cache on the IBM 3090, 18 but I assume 
the operating system presents to the user 


the image of totally shared memory. For 
example, although the cache on the IBM 
3090 is local to a processor, cross-cache 
validation makes the cache transparent to 
the user. The user need not worry that a 
piece of data is in the cache of the wrong 
processor, since the operating system 
makes sure that the correct value is deliv¬ 
ered. In fact, the programmer is not 
allowed to explicitly use the cache in any 
way, making the system look like it has a 
single, global memory. 

Other types of shared memory organizations
are possible. For example, the Alliant
FX/8 10 has several processors run out
of a common cache. A completely differ¬ 
ent approach was taken by the NYU Ultra 
machine. 19 Although the memory is dis¬ 
tributed, the system fits my definition of a 
shared memory machine because any 
processor module can be connected to any 
memory module. 

A key feature of shared memory systems 
is that the access time to a piece of data is 
independent of the processor making the 
request. If the code running on two proces¬ 
sors can be swapped without affecting per¬ 
formance, the system has a true, shared 
memory. This is not to say that there is no 
memory contention. Such issues as page 
faults and memory bank conflicts still 
affect performance, just as in uniproces¬ 
sors. Clearly, the aggregate memory band¬ 
width will limit the number of processors 
that can be accommodated on the system. 

Message passing. Message passing sys¬ 
tems are configured so that some memory 
is local to each processor but none is 
globally accessible. The only way for the 
application to share data among proces¬ 
sors is for the programmer to explicitly 
code commands to move data from one 
node to another. The time it takes for a 
processor to access data depends on its dis¬ 
tance from the processor that currently has 
the data in its local memory. Therefore, in 
contrast to the shared memory systems, the 
performance of an algorithm will depend 
on how well the location of data matches 
up with its use. 

Figure 2 shows a fully interconnected 
message passing system, with each proces¬ 
sor having a direct connection to every 
other processor. Such a scheme is imprac¬ 
tical when there are a large number of 
processors. Therefore, designers of mes¬ 
sage passing systems are forced to pick less 
dense wirings. The particular choice made 
has an important influence on the 
algorithms to be run on the machine. An 
algorithm designed for one machine can 
perform very badly on a different machine.

A large number of connection schemes 
have been used. 20 The simplest approach 
is to connect the machines in a ring with 
each processor talking to its two nearest 
neighbors. In such a machine, it takes a 
time proportional to the number of proces¬ 
sors to send data to each processor. 

A machine with a denser wiring is a 
mesh machine in which the processors are 
connected in a two-dimensional grid. Each 
processor talks to its north, south, east, 
and west neighbors. If there are p processors
in the machine, it takes time proportional
to $\sqrt{p}$ to send data to each processor.

An even denser wiring is provided by a 
hypercube interconnection, one of today’s 
most popular designs. 21-23 Each processor
is assigned a binary id number
$0 \le n \le 2^d - 1$, where d is called the
dimension of the cube. Two processors are 
connected through a port if their identifi¬ 
cation numbers differ only in the cor¬ 
responding bit in the id number. Thus, 
processor 0000 is connected to processor 
1000 through the left-most communica¬ 
tions port of each processor. The machine 
is called a hypercube because the connec¬ 
tion scheme can be pictured as a cube for 
d = 3. Each node is placed at a corner and 
the direct links become the edges of the 
cube. The advantage of this configuration
is that the largest distance between processors
is proportional to $\log_2 p$.
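The wiring rule can be sketched in a few lines of Python. This is only an illustration of the bit-flip rule described above; the function names `hypercube_neighbors` and `hops` are hypothetical, and ports are numbered here from the right-most bit (port 0) to the left-most bit (port d-1), which is an assumption.

```python
def hypercube_neighbors(node, d):
    """Return the d nodes whose ids differ from `node` in exactly one bit.

    Entry `port` of the result is the neighbor reached through that port.
    """
    return [node ^ (1 << port) for port in range(d)]


def hops(a, b):
    """Fewest links between a and b: the number of bit positions that differ."""
    return bin(a ^ b).count("1")


d = 4
# Processor 0000 reaches processor 1000 through the port for the left-most bit.
left_most_port = d - 1
neighbor = hypercube_neighbors(0b0000, d)[left_most_port]

# The largest distance from node 0 is d, that is, log2 of the processor count.
diameter = max(hops(0, n) for n in range(2 ** d))
```

A message that must cross several links can be routed by correcting one differing bit at each hop, which is why the distance equals the number of differing bits.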

There exists a very simple connection 
system with a maximum distance of two 
from any processor to any other processor: 
a star connection. In such a configuration, 
a central processor connects to every other 
machine. Because of the large amount of 
traffic that the center machine must han¬ 
dle, it often differs from the rest of the 
processors. 24 Although the maximum dis¬ 
tance between processors is independent of 
p, the processor in the middle must be able 
to handle all the traffic generated, which 
limits the number of processors in such a 
system. 
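The four wirings can be compared by their largest processor-to-processor distance. The sketch below is an illustration only; it assumes a ring with links usable in both directions, a square mesh without wraparound links, a hypercube whose size is a power of two, and a star whose count p includes the central processor. The function names are hypothetical.

```python
import math


def ring_diameter(p):
    """Ring with two-way links: at most half way around."""
    return p // 2


def mesh_diameter(p):
    """Square mesh without wraparound: corner to opposite corner."""
    side = math.isqrt(p)
    if side * side != p:
        raise ValueError("p must be a perfect square")
    return 2 * (side - 1)


def hypercube_diameter(p):
    """Hypercube of dimension d = log2(p): every bit of the id differs."""
    if p < 1 or p & (p - 1):
        raise ValueError("p must be a power of two")
    return p.bit_length() - 1


def star_diameter(p):
    """Star: one hop in to the center and one hop out again."""
    return 2 if p > 2 else p - 1


for p in (16, 64, 1024):
    print(p, ring_diameter(p), mesh_diameter(p),
          hypercube_diameter(p), star_diameter(p))
```

The distances grow as p, the square root of p, and the logarithm of p for the first three wirings, and stay constant for the star, matching the scaling stated in the text.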

Hybrid. Hybrid systems have some of 
the properties of shared memory systems 
and some of the properties of message 
passing. As illustrated in Figure 3, all mem¬ 
ory is local to a given processor, but the 
operating system makes the machine look 
like it has a single, global memory. Thus, 
programs are written as if for a shared 
memory system. However, data must be 
laid out as if for a message passing system 
if the best performance is to be obtained, 
since the access time depends on the distance
between the owner of the data and
the requester. The IBM RP3, 25 BBN Butterfly, 26
and Cedar 27 are examples of
hybrid machines.

Figure 3. Schematic of a hybrid machine.

As far as the programmer is concerned, 
hybrid systems are coded like shared mem¬ 
ory systems. Even the bugs in programs are 
the same as for shared memory. However, 
the performance considerations resemble 
those of message passing machines. For¬ 
tunately, the penalties for poor data layout 
are often considerably smaller on hybrid 
systems. Message passing systems typically 
take hundreds to thousands of machine 
cycles to deliver a message, while hybrid 
systems deliver data from a remote mem¬ 
ory in tens of cycles. Even so, data layout 
is key to algorithm performance, and the 
aggregate communications speed is a limit 
on the number of processors that can be 
accommodated. 

Comments on taxonomy. While 
Hockney 28 was careful to distinguish 
architecture from function, I am more 
interested in the programmer’s view of the 
system. Such a view necessarily mixes the 
actual hardware with the picture of the 
machine presented by the available soft¬ 
ware. For example, a group of processors 
sharing a common memory would be clas¬ 
sified as a message passing system if the 
software tools only used the shared mem¬ 
ory for message buffers. However, a clever 
programmer who managed to use these 
buffers to share data would think of the 
machine as a shared memory system. 

My classification scheme also includes 
questions of performance, something 
intentionally left out of other taxonomies. 
In particular, the principal difference 
between a hybrid and a shared memory 
system is one of performance. A program¬ 
mer interested only in correct operation 
can treat a hybrid system as if it were a 
shared memory system. This approach is 
reasonable if the penalty for accessing data 
out of the local memory is small. However, 
the programmer must be aware of the 
details of the hardware implementation in
order to produce efficient code. Even a factor
of two delay in getting data can seriously
degrade performance.

Software tools 

Parallel processing hardware does the 
programmer no good without a means of 
describing the parallelism to the system. In 
large measure, the type of parallelism 
selected by the programmer depends as 
much on the software tools as on the 
underlying hardware. This section 
describes some of the tools that can be used 
to make a program run on several 
processors. 


Message passing. Message passing sys¬ 
tems need only two basic functions added 
to the standard language support, SEND 
and RECEIVE. Of course, most 
implementations include a variety of mode 
setting and query functions not discussed 
here. 

SEND is used to send a message contain¬ 
ing data from one processor to another. 
There are actually two forms of SEND. 
One continues processing immediately on 
dispatching the message; the other waits to 
make sure the message has arrived (but not 
necessarily been read into the memory of 
the recipient). The latter is called a blocked 
SEND and the former, an unblocked 
SEND. At a minimum, the arguments 
include a destination, the message length, 
and an array containing the message. Other 
arguments often used are a status word, a 
port address or routing information, and 
a flag to indicate whether or not to wait for 
an acknowledgment. 

In general, a blocked SEND is used only 
in the presence of unreliable communica¬ 
tions. For example, if the processors do not 
maintain message queues, a message will 
be rejected until the previous message has 
been received. If it is important that all 
messages be sent in a particular order, a 
blocked SEND must be used. (I assume
that the operating system will continually
try to transmit a rejected message from a
blocked SEND.)

      ELSE IF (I.EQ.2) THEN
      ELSE IF (I.EQ.3) THEN

      parallel case
      parallel
      parallel
      end parallel

Figure 4. PARALLEL CASE code.

A RECEIVE is used to read a message 
sent from another processor. It can also be 
blocked or unblocked. At a minimum, the 
arguments include an array to contain the 
message and the length of this array. Other 
options frequently used are the processor 
id of the sender or the input port to be read, 
a status flag, and an indication of whether 
or not to send an acknowledgment. 

Both blocked and unblocked modes are 
needed. If the algorithm requires a specific 
piece of data from another processor, the 
programmer codes a blocked RECEIVE. 
The receiver then waits for the data to 
arrive. Unfortunately, blocked RECEIVE 
can lead to a deadlock in which two proces¬ 
sors are waiting for data from each other. 

Unblocked RECEIVE has two uses. The 
most common use is to implement a global 
receive, in which the processor needs to 
receive messages on several input ports but 
the order doesn’t matter. If the port id is an 
argument to the RECEIVE, an unblocked 
RECEIVE allows the program to continu¬ 
ally test the input ports and read the data 
as it arrives. This requirement could be met 
by providing a RECEIVE FROM ANY 
construct. 

The other use for unblocked RECEIVE 
is asynchronous input. Here the program 
continually checks the input port for a 
message. If there is no message, one piece 
of work is done; if there is a message, a 
different piece of work is done. In general, 
such a scheme is useful for load balancing. 
If the message carries data needed by high 
priority work, it is still possible to do low 
priority work until the data arrives. If a 
blocked RECEIVE were used, the proces¬ 
sor would be idle until the message arrived. 
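The asynchronous-input pattern can be sketched with Python's standard `queue` module standing in for an input port. This is an illustration under assumptions, not any particular machine's interface: the helper names `receive` and `worker` are hypothetical, a message is assumed never to be `None`, and low priority work is modeled as a list of callables.

```python
import queue


def receive(port, blocked=True):
    """Blocked: wait for a message. Unblocked: return None at once if none is waiting."""
    try:
        return port.get(block=blocked)
    except queue.Empty:
        return None


def worker(port, low_priority_jobs):
    """Do low priority work until the data for the high priority work arrives."""
    log = []
    jobs = list(low_priority_jobs)
    while True:
        message = receive(port, blocked=False)   # poll the input port
        if message is not None:
            log.append(("high", message))         # data arrived: do the high priority work
            return log
        if jobs:
            log.append(("low", jobs.pop(0)()))    # nothing yet: do one piece of other work
        else:
            # No work left to overlap, so fall back to a blocked RECEIVE.
            log.append(("high", receive(port, blocked=True)))
            return log
```

With a blocked RECEIVE in place of the poll, none of the low priority jobs would run before the message arrived.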

Shared memory. For a number of reasons, 
the language extensions needed by shared
memory systems are much more extensive
than those for message passing machines. 
First, there is the need to distinguish which 
data is private to each processor and which 
is known to all processors. Second, because 
data is shared, synchronization is needed 
to prevent out-of-sequence access to mem¬ 
ory. Primarily, though, the shared memory 
allows an entirely different style of pro¬ 
gramming that has several modes of oper¬ 
ation, each requiring a different set of 
language extensions. 

There are two commonly used ways to 
divide up the work in a shared memory sys¬ 
tem. In the fork-join style, a process will 
spawn subprocesses, a FORK, and wait for 
them to finish, a JOIN. In the SPMD (sin¬ 
gle program, multiple data) style, each 
processor runs the same program but exe¬ 
cutes different code depending on its 
processor id or data in shared memory. 

Both fork-join and SPMD programs 
need a means to restrict access to code. A 
critical section contains code that gets 
executed by all processors one at a time. It 
is usually used for reduction operations 
such as summing into a global variable. A 
serial section is code to be executed by only 
one processor and skipped by all others. It 
is usually used to initialize global data. 

Synchronization is also needed. In mes¬ 
sage passing systems a blocked RECEIVE 
synchronizes; in fork-join, the JOIN serves 
this purpose. In SPMD programming, 
other constructs are needed. A barrier is a 
point in the code where all processors wait 
for the last one to arrive. A barrier is dan¬ 
gerous because a processor might branch 
around it, which causes all the other 
processors to wait until the program ter¬ 
minates. Another way of synchronizing is 
the WAIT UNTIL construct. Here each 
processor continually checks shared mem¬ 
ory to see if some particular condition is 
met. In contrast to the barrier where the 
processors wait at a specific line of code, 
each processor can WAIT UNTIL the
same condition is met from different parts
of the program. One common use is to wait 
until an input or output operation finishes 
before continuing the computation. 

Probably the most common SPMD con¬ 
struct is the PARALLEL DO. Since the 
cost of sharing data is very small in shared 
memory systems, programmers often 
parallelize their code at the DO-loop level. 
If all iterations of a loop are independent, 
each processor can run a different subset 
of the loop index range as long as each 
value of the index gets used exactly once. 

Two mechanisms are used to distribute
the work in the loop to the processors. A 
self-scheduled PARALLEL DO works by 
giving the first value of the loop index to 
the first processor to arrive, the second 
index to the second processor, and so forth. 
A processor that reaches the end of the 
loop returns to the top to get more work. 
A prescheduled PARALLEL DO works by 
partitioning the loop ahead of time so that 
each processor will do a certain set of loop 
indices, no matter how long each one takes. 

Self-scheduled PARALLEL DO pro¬ 
vides for automatic load balancing, since 
processors get more to do as soon as they 
have finished with their work. However, 
assuring that each processor gets a unique 
value of the loop index forces some syn¬ 
chronization not needed for prescheduled 
loops. 

The choice between self- and presche¬ 
duled loops depends on both the applica¬ 
tion and the hardware. If the work is 
naturally load balanced, and the synchro¬ 
nization cost is high, then prescheduling is 
preferred. If the amount of work depends 
on the loop index or if the synchronization 
cost is low, then self-scheduling will per¬ 
form better. 
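The two scheduling mechanisms can be sketched with Python threads standing in for processors. This is an illustration only: the function names are hypothetical, the loop index runs from 0 rather than 1, and the prescheduled version assumes a cyclic assignment of indices, which is one of several possible partitions.

```python
import threading


def _run_all(run, nprocs):
    """Start one thread per simulated processor and wait for all of them."""
    threads = [threading.Thread(target=run, args=(p,)) for p in range(nprocs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def prescheduled(n, nprocs, body):
    """Each processor takes a fixed set of indices; no synchronization in the loop."""
    def run(procid):
        for i in range(procid, n, nprocs):
            body(i, procid)
    _run_all(run, nprocs)


def self_scheduled(n, nprocs, body):
    """Each processor claims the next unused index; the claim must be synchronized."""
    next_index = 0
    lock = threading.Lock()

    def run(procid):
        nonlocal next_index
        while True:
            with lock:                 # this is the extra synchronization cost
                i = next_index
                next_index += 1
            if i >= n:
                return
            body(i, procid)
    _run_all(run, nprocs)
```

In both versions every index is executed exactly once; only the self-scheduled version lets a processor that finishes early pick up more work.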

Another useful construct is the PARAL¬ 
LEL CASE. Its structure is similar to a 
standard CASE statement available in 
several languages. It differs from the usual 
definition in that all cases get executed in 
parallel. It can be implemented with a 
PARALLEL DO, but PARALLEL CASE 
leads to more readable code as shown in 
Figure 4. 

Simple problem 

The best way to get a feel for what the 
programming tools described in the previ¬ 
ous section imply is to look at a simple 
example, summing a list of 1,000,000 
numbers. Actually, this problem is non¬ 
trivial if the best performance is to be 
achieved, and numerous algorithms have 
been proposed. 12 Since we are interested 
in coding style, I will discuss only the simplest
method.

Figure 5 shows a program that could be 
run on a uniprocessor. The main program 
reads in the numbers to be summed, calls 
a subroutine to do the arithmetic, and 
prints the result. The subroutine initializes 
the sum, adds the numbers, and returns. 
I have separated the work in this way to 
illustrate several of the coding styles. 

A simple notation has been used in the 
sample codes. Code shown in uppercase is 
standard Fortran; code in lowercase 
represents language extensions. While 
these codes have not been run, a desk 
check has been done. 


      PARAMETER ( N = 1000000 )
      DIMENSION A(N)
      READ (*) A
      CALL SUMSUB (SUM,A,N)
      WRITE (*) SUM
      END

      SUBROUTINE SUMSUB (SUM,A,N)
      DIMENSION A(N)
      SUM = 0.0
      DO 10 I = 1, N
         SUM = SUM + A(I)
   10 CONTINUE
      RETURN
      END


Figure 5. Uniprocessor version. 


Shared memory. One of the first things tried 
on a parallel processor was the fork-join 
approach shown in Figure 6a. We can see 
how the constructs introduced in “Soft¬ 
ware tools” are used to get the program 
running on more than one processor. First, 
we declare the shared data GLOBAL. Each 
processor will have a private copy of any 
variable not explicitly declared as shared. 

In order to divide up the work equally, we 
need to know how many processors are 
available. It is common practice to code all 
programs independently of the number of 
processors. This convention not only 
enhances the portability of the code, but 
makes it easier to debug on a single proces¬ 
sor. I assume the system will provide the 
number of processors through the system 
variable NPROCS. 

Work is divided among the processors 
using the CREATE function. Each time a 
CREATE statement is executed, a task is 
generated that runs the indicated subrou¬ 
tine, here SUMSUB. Each task works on a 
different part of the array, as indicated by 
A(IS). It is important that the address of 
A(IS) be computed before the CREATE is 
completed. If we rely on the created task to 
fetch the address of A(IS) from shared 
memory, we may find that the main routine 
has changed it before the created task gets 
started. 

Once the main program has finished dis¬ 
tributing the work, it does the remaining 
part of the job. When the main program 
reaches the JOIN statement, it waits until 
all other tasks have completed, at which 
time the result can be printed. 

All the computational work is done in 
the subroutine SUMSUB. Here we see the 
use of a serial section and a critical section. 

Since SUM is a global variable, we could 
get wrong results if all processors were 
allowed to initialize the variable, so we use
a serial section. We could have set SUM =
0 in the main routine, but perhaps we don’t
want to rely on the user remembering to
initialize. (The real reason is that I didn’t
want to pass up the opportunity to illustrate
a serial section.)

(a)
      PARAMETER ( N = 1000000 )
      global A(N), SUM, N
      READ (*) A
      INC = (N+nprocs-1)/nprocs
      IS = 1
      DO 10 J = 1, nprocs-1
         create ( 'SUMSUB', SUM, A(IS), INC )
         IS = IS + INC
   10 CONTINUE
      CALL SUMSUB ( SUM, A(IS), N-IS+1 )
      join
      WRITE (*) SUM
      END

      SUBROUTINE SUMSUB ( SUM, A, N )
      DIMENSION A(N)
      serial section
         SUM = 0.0
      end serial section
      DO 10 I = 1, N
         critical section
            SUM = SUM + A(I)
         end critical section
   10 CONTINUE
      RETURN
      END

(b)
      SUBROUTINE SUMSUB (SUM,A,N)
      DIMENSION A(N)
      serial section
         SUM = 0.0
      end serial section
      SUMLOC = 0.0
      DO 10 I = 1, N
         SUMLOC = SUMLOC + A(I)
   10 CONTINUE
      critical section
         SUM = SUM + SUMLOC
      end critical section
      RETURN
      END

Figure 6. Fork-join (a) and fork-join with good performance (b).

Figure 7. Message passing code for processor 0 (a), message passing code for processors
other than 0 (b), and message passing for SPMD code (c).

We need the critical section in the loop
because SUM is a global variable. If
processor 1 and processor 2 each fetch the
old value, add their respective contributions,
and store the new value, whichever stores
last overwrites the other's update and the
program will produce an incorrect result.
The critical section guarantees that only
one processor can update the global
variable SUM at any instant.

Even though the code in Figure 6a will
produce correct results, it suffers from
what is called a performance bug: it runs
much slower than it should. The reason is
the critical section within the loop. Since
only one processor is allowed to update the
value at any time, only one addition can be
done at a time. Therefore, we have lost
almost all the possible parallelism in the
code. The code in Figure 6a will run slower
than the uniprocessor version because of
the synchronization overhead incurred at
the critical section.

The code in Figure 6b, where we sum the 
data into a local variable, will perform 
much better. Since each processor has its 
own copy of SUMLOC, all the arithmetic 
in the loop can be done in parallel. At the 
end of the loop each processor adds its 
contribution to the global sum in a critical 
section. Now the critical section is encoun¬ 
tered only once per processor instead of 
once per addition. 
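The local-accumulation idea carries over directly to other shared memory settings. The following Python sketch is an illustration only, with threads standing in for processors and a lock standing in for the critical section; the name `parallel_sum` is hypothetical. Python's global interpreter lock means the sketch shows the structure of the technique, not a real speedup.

```python
import threading


def parallel_sum(a, nprocs):
    """Sum `a` with one task per slice; each task enters the critical section once."""
    if not a:
        return 0.0
    total = 0.0
    critical = threading.Lock()
    inc = (len(a) + nprocs - 1) // nprocs     # same rounding as INC in the figures

    def sumsub(start):
        nonlocal total
        sumloc = 0.0                          # private to this task, like SUMLOC
        for x in a[start:start + inc]:
            sumloc += x                       # no synchronization inside the loop
        with critical:                        # one critical section per task
            total += sumloc

    threads = [threading.Thread(target=sumsub, args=(s,))
               for s in range(0, len(a), inc)]
    for t in threads:
        t.start()
    for t in threads:                         # plays the role of the JOIN
        t.join()
    return total
```

Moving the lock inside the inner loop would reproduce the performance bug described above: the result stays correct, but the additions are serialized.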

This routine has a curious kind of side 
effect. Fortran programmers are used to 
calling subroutines and having the values 
of variables in the calling sequence and in 
COMMON changed. Normally though, 
except for knowing the data types of the 
variables in the calling sequence and 
COMMON, the person writing the sub¬ 
routine need not know what is happening 
in the calling routine. Such is not the case 
here. The person writing SUMSUB must 
know that SUM is a global variable. The 
absence of any such indication within the 
subroutine is a serious problem for debug¬ 
ging and maintenance. 

Message passing. The coding style on a 
message passing system is quite different 
from that used on a shared memory system 
for two main reasons. First, data must be 
explicitly moved from the memory of one 
processor to the memory of another. Sec¬ 
ond, there is often no master processor to 
spawn tasks as in Figure 6a. This second 
difference is due in part to the distributed 
memory. For a master processor to create 
tasks, code would have to be physically 
moved from the master processor to each 
slave. The communications cost of such a 
transfer makes this approach impractical. 
Therefore, it is usual to load the code for 
each processor once at the start of the job. 
Figure 7a shows the program that runs on 
processor 0, and Figure 7b shows the code 
running on all other processors. 

Processor 0 reads the data and distrib¬ 
utes it using the SEND command. The 
arguments can be interpreted as saying

    SEND a message to processor J consisting
    of one word starting at the address of INC
    and INC words starting at the address of
    A(IS).

Next, processor 0 calls SUMSUB to add up 
the rest of the numbers. After that is done, 
it goes into a loop to get the partial sums 
from the other processors. The RECEIVE 
FROM ANY says to read one word from
any processor into SUMJ. Once all partial
sums have been received, the total is
printed.

      PARAMETER ( N = 1000000 )
      global A(N), SUM, N, JSYNC
      DATA JSYNC/0/

      IF ( procid .EQ. 0 ) THEN
         READ (*) A
         SUM = 0.0
         JSYNC = 1
      ENDIF

      INC = (N+nprocs-1)/nprocs
      IS = INC*procid
      INCR = MIN ( INC, N-IS )
      wait until ( JSYNC .NE. 0 )

      CALL SUMSUB (SUM,A(IS+1),INCR)
      barrier

      IF ( procid .EQ. 0 ) WRITE (*) SUM
      END

Figure 8. SPMD code.

All the other processors run the program 
shown in Figure 7b. As soon as the data 
arrives at the communications port of a 
processor waiting at a RECEIVE, it can 
copy the data into its local memory and 
proceed. Each receives a message from 
processor 0 consisting of one word to be 
stored as INC and INC words to be stored 
into array A. Each then calls a version of 
SUMSUB identical to that in Figure 5. On 
return from SUMSUB, each processor 
sends its contribution to processor 0 and 
exits. Since this is the main routine on these 
processors, the job terminates. In order to 
compute more than one sum, the program¬ 
mer would have to code a loop with an 
appropriate termination condition. 

Another way to manage the code in such 
an environment is to program in the single 
program, multiple data style. When coded 
this way, each processor runs the same code 
unless a processor-dependent control is 
used. An example of SPMD programming 
is shown in Figure 7c. 

At the start of the job, each processor is 
in the same state except for a unique iden¬ 
tifier, the PROCID. Each processor first 
computes its own copy of INC. On reach¬ 
ing the first IF statement, all processors but 
number 0 skip immediately to the 
RECEIVE where they wait for data to 
arrive. Processor 0 reads the data and dis¬ 
tributes it using the SEND command. On 
finishing with SUMSUB, each processor 
(other than processor 0) sends its contribu¬ 
tion to processor 0 and exits. Processor 0 
sums the partial sums and prints the result. 
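The SPMD message passing flow just described can be imitated in Python with one queue per processor acting as its input port. This is a sketch under assumptions, not the code of the figure: threads stand in for processors, `send` and `receive` are hypothetical helpers, and the data list is touched only by processor 0, so all sharing happens through messages.

```python
import queue
import threading


def spmd_sum(a, nprocs):
    """Every simulated processor runs `program`; only messages carry data."""
    ports = [queue.Queue() for _ in range(nprocs)]   # one input port each
    result = []

    def send(dest, message):
        ports[dest].put(message)

    def receive(procid):
        return ports[procid].get()                   # blocked RECEIVE

    def program(procid):
        inc = (len(a) + nprocs - 1) // nprocs        # each computes its own INC
        if procid == 0:
            for j in range(1, nprocs):               # distribute the data
                send(j, a[j * inc:(j + 1) * inc])
            local = a[:inc]
        else:
            local = receive(procid)                  # wait for data to arrive
        partial = sum(local)                         # the SUMSUB step
        if procid == 0:
            total = partial
            for _ in range(1, nprocs):
                total += receive(0)                  # any sender, any order
            result.append(total)
        else:
            send(0, partial)

    threads = [threading.Thread(target=program, args=(p,)) for p in range(nprocs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result[0]
```

Processor 0's port only ever holds partial sums, so reading it without naming a sender corresponds to RECEIVE FROM ANY.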

Notice that the main routine is some¬ 
what more complicated than the shared 
memory version. To compensate for this 
complexity, SUMSUB is much simpler 
and independent of the architecture of the 
machine. In addition, there are no prob¬ 
lems of data dependence, and all synchro¬ 
nization is handled by the SEND and 
RECEIVE commands. However, this 
example does not require any global syn¬ 
chronization, which would be more com¬ 
plicated than on a shared memory system. 

Several features of this code affect the
elapsed time needed to run the program.
First, there is the cost of communication.
There will be some overhead in routing the
data through the interconnection network
in addition to the time taken to physically
move the data. This overhead may be a
small part of the communication cost of
sending a long list of array values, but it is
certain to dominate the cost of sending the
partial sums back to processor 0. Second,
there is the problem of load balancing. It
is conceivable that some of the processors
will have completed their work before the
last one has received all its data. We say the
load balancing is poor because some
processors are idle while useful work
remains to be done.

SPMD shared memory. The SPMD cod¬ 
ing style is not limited to message passing 
systems. 6,29,30 Figure 8 shows how SPMD 
could be used on a shared memory system. 
Subroutine SUMSUB is identical to that in 
Figure 6b. 

As with the previous shared memory 
example, some of the data must explicitly 
be declared GLOBAL. As with the mes¬ 
sage passing example, all processors but 
number 0 skip the code that reads the data. 
However, we do not have the SEND- 
RECEIVE mechanism to synchronize the 
processors so we use the WAIT UNTIL. 
The WAIT UNTIL construct is an exam¬ 
ple of event synchronization. In this exam¬ 
ple, each processor continually checks 
global variable JSYNC until it takes on a 
value different from 0. On some systems 
the memory location containing JSYNC 
will be a hot spot. Special hardware is often 
needed to prevent such hot spots from 
degrading system performance. 19,25,31 

While they are waiting for JSYNC to be 
set, all processors compute their own 
copies of the local variables INC, INCR, 
and IS. While these variables could be 
made global and computed in a serial sec¬ 
tion, the overhead in any extra synchroni¬ 
zations and additional memory contention 
would outweigh any possible savings. 

Once JSYNC = 1, the waiting processors
call SUMSUB. Next, we synchronize
with a barrier to prevent processor 0 from
writing SUM until all processors have finished
adding their contributions. As soon
as the last processor reaches the barrier,
they all continue processing.
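The same sequence of event synchronization followed by a barrier can be sketched in Python. This is an illustration only, with several stated departures from the figure: threads stand in for processors, a `threading.Event` replaces the spin on JSYNC (so waiting processors sleep instead of polling a hot spot), the slice bounds are computed after the wait because the data length is not known beforehand, and `read_data` is a hypothetical stand-in for the READ statement.

```python
import threading


def spmd_shared_sum(read_data, nprocs):
    shared = {"a": None, "sum": 0.0}          # the GLOBAL data
    jsync = threading.Event()                 # set when the data is ready
    barrier = threading.Barrier(nprocs)
    critical = threading.Lock()
    printed = []

    def program(procid):
        if procid == 0:                       # all others skip the read
            shared["a"] = read_data()
            shared["sum"] = 0.0
            jsync.set()                       # JSYNC = 1
        jsync.wait()                          # wait until ( JSYNC .NE. 0 )
        a = shared["a"]
        inc = (len(a) + nprocs - 1) // nprocs
        sumloc = sum(a[procid * inc:(procid + 1) * inc])
        with critical:
            shared["sum"] += sumloc
        barrier.wait()                        # nobody passes until all have added
        if procid == 0:
            printed.append(shared["sum"])     # stands in for WRITE (*) SUM

    threads = [threading.Thread(target=program, args=(p,)) for p in range(nprocs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return printed[0]
```

Without the barrier, processor 0 could report the sum before the slower processors had added their contributions.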


Realistic problem 

While a simple problem like summing a 
series of numbers would appear to be a 
trivial task, we have seen that there are 
some subtle points to consider on a paral¬ 
lel processor. In this section, we will look 
at these points again with a nontrivial 
problem, solving systems of linear 
equations. 

We want to solve the system of equa¬ 
tions A x = b. The best method for 
general, dense matrices is Gaussian elimi¬ 
nation to perform an LU decomposition 
followed by a forward elimination and a 
backward substitution. The procedure is 

(1) Use Gaussian elimination to find 
two triangular matrices L and U such that 
A = LU. 

(2) Solve the lower triangular linear sys¬ 
tem L y = b by forward elimination. 

(3) Solve the upper triangular linear sys¬ 
tem U x = y by back substitution. 

The time needed to factor a matrix of
order N into its LU components is proportional
to $N^3$, while the time for the forward
elimination and back substitution
only increases as $N^2$. For matrices with
N > 100, the LU decomposition accounts
for over 90 percent of the execution time.
Therefore, I discuss only the factorization
step.
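As a rough check on the 90 percent figure, one can compare leading-order operation counts. The counts used below are the usual textbook estimates and are an assumption here, not numbers taken from the text: about (2/3)N^3 floating-point operations for the factorization and about 2N^2 for the two triangular solves together. Lower-order terms, the pivot search, and memory effects are ignored.

```latex
\frac{\tfrac{2}{3}N^{3}}{\tfrac{2}{3}N^{3} + 2N^{2}} = \frac{N}{N+3},
\qquad
\left.\frac{N}{N+3}\right|_{N=100} = \frac{100}{103} \approx 0.97
```

Under these assumptions the factorization share is already about 97 percent at N = 100, which is consistent with the statement above.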









Figure 9. LU decomposition by Gaussian elimination.


      SUBROUTINE FACTOR(A,N,IPVT,INFO)
      DIMENSION A(N,N),IPVT(N)
      INFO = 0
      DO 20 K = 1, N-1
         IPVT(K) = ISAMAX(N-K+1,A(K,K),1) + K - 1
         CALL SSWAP (N-K+1,A(K,K),N,A(IPVT(K),K),N)
         IF ( A(K,K) .EQ. 0.0 ) THEN
            INFO = K
         ELSE
            T = -1.0/A(K,K)
            CALL SSCAL (N-K,T,A(K+1,K),1)
            DO 10 J = K+1, N
               CALL SAXPY (N-K,A(K,J),A(K+1,K),1,A(K+1,J),1)
   10       CONTINUE
         ENDIF
   20 CONTINUE
      IF ( A(N,N) .EQ. 0.0 ) INFO = N
      RETURN
      END


Figure 10. LU decomposition. 


Figure 9 illustrates the decomposition 
process. The following procedure is used 
for each column in turn: 

(1) Search the elements on or below the 
diagonal of the current column. 

(2) Interchange rows to move the largest 
of these elements to the diagonal. This new 
diagonal element is called the pivot. 

(3) Divide the elements below the 
diagonal by the pivot to produce a set of 
multipliers. 

(4) Multiply the part of the pivot row to 
the right of the diagonal times each mul¬ 
tiplier and subtract the product from the 
corresponding part of each row. 

The code in Figure 10 embodies the LU 
decomposition algorithm. It is nearly iden¬ 
tical to SGEFA from Linpack and calls 
several subroutines from the BLAS. 32 As 
used here they mean 

ISAMAX: Search elements K to N of 
column K and return the index of the 
element having the largest mag¬ 
nitude. 


SSWAP: Interchange elements K to N of
rows K and IPVT(K).

SSCAL: Multiply elements K+1 to N of
column K by T.

SAXPY: Multiply elements K+1 to N of
column K by A(K,J) and add the
result to the corresponding elements
of column J.

The arguments in the calling sequence are 

A: the matrix to be factored; 

N: the order of the matrix; 

IPVT: an array to save the order of inter¬ 
changes needed for pivoting; and 

INFO: a flag that indicates if a pivot is 
identically zero. 
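For readers who want to trace the algorithm by hand, the same steps can be written out in Python with the BLAS calls expanded into explicit loops. This is a sketch, not a transcription of the figure: indices start at 0, the matrix is a list of row lists, `ipvt` holds 0-based row numbers, and the last entry of `ipvt` is left pointing at itself, which is an assumption. As in the figure, the multipliers are stored negated so that the update is an addition.

```python
def factor(a):
    """Factor the square matrix `a` in place; return (ipvt, info).

    info is 0 on success, or the 1-based number of a column whose pivot is zero.
    """
    n = len(a)
    ipvt = list(range(n))
    info = 0
    for k in range(n - 1):
        # ISAMAX: row of the largest magnitude on or below the diagonal.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        ipvt[k] = p
        # SSWAP: interchange elements k..n-1 of rows k and p.
        if p != k:
            for j in range(k, n):
                a[k][j], a[p][j] = a[p][j], a[k][j]
        if a[k][k] == 0.0:
            info = k + 1
            continue
        # SSCAL: scale the subdiagonal of column k by t = -1/pivot.
        t = -1.0 / a[k][k]
        for i in range(k + 1, n):
            a[i][k] *= t
        # SAXPY: add a[k][j] times column k to column j, below the diagonal.
        for j in range(k + 1, n):
            for i in range(k + 1, n):
                a[i][j] += a[k][j] * a[i][k]
    if a[n - 1][n - 1] == 0.0:
        info = n
    return ipvt, info
```

The column updates in the innermost pair of loops are independent of one another for different values of j, which is the parallelism exploited in the versions that follow.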

Shared memory. Figure 11a shows the 
FACTOR routine as it might be coded on 
a shared memory system using the fork- 
join approach. Assume that the arrays A 
and IPVT have been declared global by the 
calling routine. There are only two differ¬ 
ences between the sample codes in Figures 
10 and 11a. Figure 11a uses a CREATE
function instead of a CALL to routine
SAXPY, and a JOIN to synchronize the 
processes. Nothing else changes. 

This program creates a task for each 
column to be processed. This approach is 
inefficient if there are fewer processors 
than columns due to the unavoidable over¬ 
head in starting the tasks. A better 
approach is shown in Figure 11b, where we
divide the work up equally among the 
available processors. The additional argu¬ 
ment in SAXPY tells each processor the 
number of columns to process. 

Programming this problem SPMD on a 
shared memory system is slightly more 
complicated, as shown in Figure 12a. First, 
we need to assume that A and IPVT have 
both been declared global by the calling 
routine. We need one serial section to find 
the pivot and interchange rows. The barrier 
assures that none of the processors tests 
A(K,K) before the pivot step is complete. 
Next, everyone else waits at the second bar¬ 
rier while a processor scales the subdi¬ 
agonal part of the current column. Finally, 
we have a PARALLEL DO. The WAIT 
UNTIL ensures that the next column has 
been computed before the search for the 
pivot element begins. 

This example illustrates one of the dis¬ 
advantages of SPMD programming. In 
general, synchronization must be done 
before most program branches. Not only 
does this requirement introduce extra over¬ 
head, but it is a source of potential errors. 
In practice, the extra overhead is small 
because most of the processors go immedi¬ 
ately to the synchronization point. In addi¬ 
tion, errors are made that involve skipping 
barriers in the program. These errors are
easier to find than most because processors 
end up at a barrier waiting for a processor 
that never shows up. The programmer has 
available the exact location where the 
processors are waiting, which helps in find¬ 
ing the error. 

It is also possible to provide a 
PRESCHEDULED DO. One way of 
prescheduling a loop is shown in Figure 
12b. The PARALLEL DO has been 
replaced by a conventional DO that 
depends on the processor id and the num¬ 
ber of processors. Each processor entering 
the loop is guaranteed to get a unique value 
of J, and each value of J will be taken by 
some processor. 


(a)
      SUBROUTINE FACTOR(A,N,IPVT,INFO)
      DIMENSION A(N,N), IPVT(N)
      INFO = 0
      DO 20 K = 1, N-1
         IPVT(K) = ISAMAX(N-K+1,A(K,K),1) + K - 1
         CALL SSWAP (N-K+1,A(K,K),N,
     *               A(IPVT(K),K),N)
         IF ( A(K,K) .EQ. 0.0 ) THEN
            INFO = K
         ELSE
            T = -1.0/A(K,K)
            CALL SSCAL (N-K,T,A(K+1,K),1)
            DO 10 J = K+1, N
               create('SAXPY',N-K,A(K,J),
     *                A(K+1,K),1,A(K+1,J),1)
   10       CONTINUE
            join
         ENDIF
   20 CONTINUE
      IF ( A(N,N) .EQ. 0.0 ) INFO = N
      RETURN
      END

(b)
      SUBROUTINE FACTOR(A,N,IPVT,INFO)
      DIMENSION A(N,N), IPVT(N)
      INFO = 0
      DO 20 K = 1, N-1
         IPVT(K) = ISAMAX(N-K+1,A(K,K),1) + K - 1
         CALL SSWAP (N-K+1,A(K,K),N,
     *               A(IPVT(K),K),N)
         IF ( A(K,K) .EQ. 0.0 ) THEN
            INFO = K
         ELSE
            T = -1.0/A(K,K)
            CALL SSCAL (N-K,T,A(K+1,K),1)
            NC = (N-K+nprocs-1)/nprocs
            J = 1
            DO 10 I = 1, nprocs
               NC = MIN ( NC, N-K-J+1 )
               create('SAXPY',NC,N-K,A(K,J),
     *                A(K+1,K),1,A(K+1,J),1)
               J = J + NC
   10       CONTINUE
            join
         ENDIF
   20 CONTINUE
      IF ( A(N,N) .EQ. 0.0 ) INFO = N
      RETURN
      END

Figure 11. Fork-join LU, one processor per column (a) and fork-join LU, several columns per process (b).

Message passing. When discussing the
shared memory implementations of LU
decomposition, we assumed the data was
already in shared memory. We cannot
ignore the step of getting the data into
memory on a message passing system. Figure
13a shows one way of distributing the
data to the contributing processors.

We have used the trick of having proces¬ 
sor 0 send data to itself. Notice that the 
columns of the matrix are distributed to 
the processors much as one would deal 
cards in a bridge game: processor 0 gets 
columns 1, p+1, 2p+1, ...; processor 1 gets
columns 2, p+2, 2p+2, ...; and so forth. If
instead we gave processor 0 the first group 
of columns, processor 1 the next set, and 
so forth, we would get very poor load 
balancing. Figure 9 shows that
after the first group of columns had been 
reduced, processor 0 would have no more 
work to do. Eventually, only one processor 
would be doing all the work. 
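
The load-balancing argument can be illustrated with a small sketch. It assumes that at elimination step K only the columns to the right of K still need updating, and it counts how many processors own at least one such column under each layout. The helper names and the 12-column, 4-processor sizes are illustrative only.

```python
def owner_cyclic(col, nprocs):
    """Card-dealing layout: column 1 -> processor 0, column 2 -> processor 1, ..."""
    return (col - 1) % nprocs


def owner_block(col, n, nprocs):
    """Block layout: the first ceil(n/nprocs) columns go to processor 0, and so on."""
    width = -(-n // nprocs)  # ceiling division
    return (col - 1) // width


def busy_processors(k, n, owner):
    """Processors that still own a column J > K to update at step K."""
    return {owner(j) for j in range(k + 1, n + 1)}


if __name__ == "__main__":
    n, p = 12, 4
    for k in (1, 4, 7, 10):
        cyc = len(busy_processors(k, n, lambda j: owner_cyclic(j, p)))
        blk = len(busy_processors(k, n, lambda j: owner_block(j, n, p)))
        print(f"step {k}: cyclic keeps {cyc} busy, block keeps {blk} busy")
```

With the block layout the number of busy processors drops steadily as the factorization proceeds, while the cyclic layout keeps every processor busy until fewer columns remain than there are processors.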

In Figure 13b we see how the work is dis¬ 
tributed. We have included additional 
arguments in the calling sequence: M, the 
number of columns to be held by each 
processor, and W, a work array used to 
hold the pivot column sent from another 
processor. On entering the DO 20 loop in 
routine FACTOR, the processor that owns 
the current pivot column finds the index of 
the maximum and scales the column. It 
then broadcasts the index of the pivot and 
the multipliers to all other processors. As 
soon as one of the other processors receives
its data, it does the row interchange and
forms the appropriate linear combinations
for the columns it owns.


      SUBROUTINE FACTOR(A,N,IPVT,INFO)
      DIMENSION A(N,N), IPVT(N)
      global JSYNC
      INFO = 0
      DO 20 K = 1, N-1
        serial section
          IPVT(K) = ISAMAX(N-K+1,A(K,K),1) + K - 1
          CALL SSWAP (N-K+1,A(K,K),N,A(IPVT(K),K),N)
        end serial section
        barrier
        IF ( A(K,K) .EQ. 0.0 ) THEN
          INFO = K
        ELSE
          serial section
            T = -1.0/A(K,K)
            CALL SSCAL (N-K,T,A(K+1,K),1)
            JSYNC = 0
          end serial section
          barrier
          parallel do 10 J = K+1, N
            CALL SAXPY (N-K,A(K,J),A(K+1,K),1,A(K+1,J),1)
            JSYNC = J
   10     CONTINUE
          wait until ( JSYNC .GE. K+1 )
        ENDIF
   20 CONTINUE
      IF ( A(N,N) .EQ. 0.0 ) INFO = N
      RETURN
      END

(a)

          DO 10 J = K+1+procid, N, nprocs
            CALL SAXPY (N-K,A(K,J),A(K+1,K),1,A(K+1,J),1)
   10     CONTINUE

(b)


Figure 12. SPMD LU decomposition (a) and SPMD LU decomposition, prescheduled DO (b).

There is no need for a barrier at the end
of the loop. Any processor that finishes
early will quickly find itself at the
RECEIVE, waiting for the owner of the new
pivot column to send the required data. If the
owner of the pivot column finishes first, it
immediately starts work on finding the
pivot and producing the multipliers. The
RECEIVE command guarantees the
needed synchronization.

Hybrid systems. Hybrid systems are
programmed like shared memory systems,
but have data access delays like message
passing systems. These delays are usually
much smaller than on message passing systems,
but they can be significant. For example,
the design of the IBM RP3 25 calls for
local data to be delivered in two machine
cycles while remote data will take 10
machine cycles to reach the functional unit.

Although the code in Figure 12a could 
be used on a hybrid system, there is a 
potential performance bug. Assume the 
data is distributed among the processors as 
done for the message passing system. A 
single processor will own the column con¬ 
taining the pivot and the multipliers. As 
coded, the first processor to reach the serial 
section will search for the pivot, and 
another will form the multipliers. If the 
processor performing these tasks is not the 
owner of the data, the program will run 
slower than it should. 
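
A rough feel for the size of this penalty comes from a simple weighted average. The sketch below is a deliberately simplified model, not a description of the RP3: it uses the two-cycle local and 10-cycle remote figures quoted above as defaults and ignores caching, network contention, and overlap of memory access with computation.

```python
def mean_access_cycles(local_fraction, local_cycles=2, remote_cycles=10):
    """Average cycles per data access when a given fraction of accesses is local."""
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must lie between 0 and 1")
    return local_fraction * local_cycles + (1.0 - local_fraction) * remote_cycles


if __name__ == "__main__":
    for fraction in (1.0, 0.5, 0.0):
        print(fraction, mean_access_cycles(fraction))
```

Under this model a processor that works entirely on another processor's column pays five times the access cost of the column's owner, which is why the ownership of the pivot column matters.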

The hybrid system may not need to 
explicitly move the data if the operating 
system can be directed to map the data as 
needed. One way to achieve this end with¬ 
out direct support from the operating sys¬ 
tem or the compiler is shown in Figure 14. 
Here we assume the data is read into the 
memory local to processor 0 and that A is 
declared global in the calling routine. 

The matrix gets distributed to the other 
processors by being copied into a local variable,
AL. Two work arrays, one local and
one global, are used to move the multipliers 
between the local memories of the proces¬ 
sors. First, the owner of the current column 
copies the multipliers from its private 
memory, AL, into the global work array W. 
After the barrier, each processor copies the 
global work array into its local work array, 
WL. This approach saves time since the 
multipliers are used many times in the 
DO 20 loop. At the end of the factoriza¬ 
tion, the entire matrix gets reassembled in 
the DO 40 loop. An undesirable side effect 
of this approach is that memory is wasted 
because two copies of the array are needed. 

Clearly, this code for the hybrid system 
is more complex and harder to debug than 
either the shared memory or message pass¬ 
ing versions. Note, however, that this com¬ 
plication is needed only to improve 
performance. The program in Figure 12a
will run correctly on a hybrid system. If the
ratio of the access times for local and
remote data is near unity, there will be no
need to worry about the location of the
data.


      PARAMETER ( N=1000, M=(N+nprocs-1)/nprocs )
      DIMENSION A(N,M), IPVT(N), COL(N), W(N), IP(N)
      IF ( procid .EQ. 0 ) THEN
        DO 10 I = 0, N-1
          READ (*) COL
          ICOL = MOD(I,nprocs)
          send ( ICOL, 'COL:N' )
   10   CONTINUE
      ENDIF
      DO 20 K = 1, M
        receive ( 0, 'A(1,K):N' )
   20 CONTINUE
      CALL FACTOR ( A, N, M, IP, W, IN )
      IF ( procid .EQ. 0 ) THEN
        DO 40 K = 0, nprocs-1
          receive ( K, 'INFO:1, IP:N' )
          IN = MAX (IN,INFO)
          DO 30 J = 0, N-1
            IF (MOD(J,nprocs).EQ.K) IPVT(J) = IP(J)
   30     CONTINUE
   40   CONTINUE
      ENDIF
      RETURN
      END

(a)

      SUBROUTINE FACTOR ( A, N, M, IP, W, IN )
      DIMENSION A(N,M), IP(N), W(N)
      IN = 0
      IC = 1
      DO 20 K = 1, N-1
        IT = MOD(K-1,nprocs)
        IF ( procid .EQ. IT ) THEN
          L = ISAMAX(N-K+1,A(K,IC),1) + K - 1
          IF ( A(L,IC) .EQ. 0.0 ) THEN
            IN = K
          ELSE
            T = -1.0/A(L,IC)
            CALL SSCAL (N-K,T,A(K+1,IC),1)
          ENDIF
          send (*, 'IN:1, L:1, A(K+1,IC):N-K' )
        ENDIF
        receive ( IT, 'IN:1, L:1, W(K+1):N-K' )
        IP(K) = L
        CALL SSWAP (M-IC+1,A(K,IC),N,A(L,IC),N)
        IF ( procid .EQ. IT ) IC = IC + 1
        IF ( IN .NE. K ) THEN
          DO 10 J = IC, M
            CALL SAXPY (N-K,A(K,J),W(K+1),1,A(K+1,J),1)
   10     CONTINUE
        ENDIF
   20 CONTINUE
      IT = MOD(N-1,nprocs)
      IF ( A(N,M) .EQ. 0.0 .AND. procid .EQ. IT ) IN = N
      send ( 0, 'IN:1, IP:N' )
      RETURN
      END

(b)


Figure 13. Message passing data distribution (a) and message passing computation (b).


Published examples 

Solving systems of linear equations is an 
important part of many applications and 
often accounts for a large part of the com¬ 
puter time. Because the algorithm can be 
written in a very compact program, it is 
commonly used as an example. 33,34 

This section presents four versions of 
this algorithm that illustrate some of the 
parallel processing constructs used. These 
programs are based on subroutine SGEFA 
from Linpack. 32 Liberties were taken with
the published codes to make them more 
readable and more like my examples. The 
reader is referred to the cited publications 
for the full programs. 

VM/EPEX. I wrote the first example, 
Figure 15, to show SPMD programming 
using the VM/EPEX software available 
for use within IBM for experimental pur¬ 
poses. 6 VM/EPEX works with the 
VM/CMS operating system using 
unprotected shared segments to provide a 
shared memory area. Each processor is a 
distinct virtual machine, and all synchro¬ 
nization is handled by semaphores in 
shared memory. 

Although the hardware can have one, 
two, or four processors, the software makes 
it possible to simulate any number of 
processors. VM/EPEX comes with a 
preprocessor that scans a source file for 
constructs beginning with @. These con¬ 
structs get converted into in-line code and 
calls to subroutines that provide the 
required functions. 

The VM/EPEX code is very similar to 
the code in Figure 12a. The @SHARED in 
the example is translated into COMMON 
and marked for loading into the shared 
memory. The construct is equivalent to 
declaring the variables A, IPVT, and N to 
be GLOBAL. @SERIAL BEGIN PROCESS = 1
defines a serial section to be
executed by the processor with PROCID = 1.
It is terminated with @SERIAL END WAIT,
which is equivalent to the
END SERIAL SECTION followed by a 
barrier. The @DO is a self-scheduled, 
PARALLEL DO terminated by the 
@ENDDO WAIT. Again, the WAIT is 
equivalent to putting a barrier following 
the loop. The option CHUNK is used to
reduce the overhead in assigning loop indices
to processors. Here it tells the system to
give each processor 10 values of the loop
index each time it gets work. Because of
the synchronization needed to get unique 
loop indices, the reduction in elapsed time 
is significant. 
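
The saving from chunking can be estimated by counting how many times processors must go through the synchronized step that hands out loop indices. The sketch below assumes one synchronized request per chunk; the function names are illustrative and are not part of VM/EPEX.

```python
import math


def synchronized_fetches(iterations, chunk):
    """Number of synchronized requests needed to hand out all loop indices."""
    return math.ceil(iterations / chunk)


def chunks(first, last, chunk):
    """Yield the (start, end) index ranges handed out, one per request."""
    start = first
    while start <= last:
        end = min(start + chunk - 1, last)
        yield (start, end)
        start = end + 1


if __name__ == "__main__":
    print(list(chunks(1, 25, 10)))
    print(synchronized_fetches(999, 1), synchronized_fetches(999, 10))
```

With a chunk size of 10, a loop of 999 iterations needs 100 synchronized requests instead of 999. The trade-off is coarser load balancing near the end of the loop, when some processors may be idle while others finish their last chunk.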


The Force. This example, Figure 16, 
shows the SPMD programming style using 
the Force. 2 The basic concept of the Force 
is very similar to that of VM/EPEX 
although they were developed indepen¬ 
dently. I have modified the published code 
to make it closer to the examples shown 
earlier. As published, the program does 
both the search for the pivot element and 
the scaling of the pivot column in parallel. 
Since these steps represent only a small part 
of the work if the matrix is large, I have 
chosen to do them on a single processor. 
A Force subroutine has a header that
indicates how many processes, NPROC,
are to be run on how many processors,
NPEM, and the local variable used to store
the process id, ME. The header also contains
the declaration of global data. As in
VM/EPEX, COMMON is used to implement
data sharing.

The Force uses a generalized concept of
a barrier. All processes stop at the barrier
until the last one has arrived. That process
then executes the code up to the END
BARRIER. Once that process has
reached this point, all processes continue
executing at the line following the END
BARRIER. Thus, this construct is equivalent
to our serial section bracketed by our
BARRIER. If there is no code between
BARRIER and END BARRIER, then this
construct is equivalent to our BARRIER.
The SELFSCHED DO is equivalent to our
PARALLEL DO.


      forcesub SGEFA(LDA,INFO) of NPROC on NPEM ident ME
      global A(1000,1000), IPVT(1000), N, INFO
      end header
      INFO = 0
      DO 50 K = 1, N-1
        barrier
          L = ISAMAX (N-K+1,A(K,K),1) + K - 1
        end barrier
        IF ( A(L,K) .EQ. 0.0 ) THEN
          INFO = K
        ELSE
          barrier
            IPVT(K) = L
            CALL SSWAP (N-K+1,A(K,K),N,A(IPVT(K),K),N)
            T = -1.0/A(K,K)
            CALL SSCAL (N-K,T,A(K+1,K),1)
          end barrier
          selfsched DO 60 J = K+1, N
            CALL SAXPY (N-K,A(K,J),A(K+1,K),1,A(K+1,J),1)
   60     end selfsched DO
        ENDIF
   50 CONTINUE
      END


Figure 16. Force code.


Protran. A collection of problems has
been put together to test language extensions
to support vector and parallel processors. 35
One of these extensions to Fortran
was developed by IMSL, Inc., and is called
Protran. 36 In addition to vector and
matrix operations, it has a number of
problem solving statements. Extensions include
some that make the language look more
like the proposed Fortran 8X array
extensions 37 and others that add parallel
processing constructs.

The parallel processing constructs added
to Protran are DO PARALLEL, CRITICAL,
and BEGIN-END. The first two
extensions do not need further explanation.
Since this language is intended for use
on shared memory systems, the BEGIN-END
is the only way of providing private
data among processors operating within a
DO PARALLEL. Although the sample
code presented does not need any local
variables, if needed they can be declared
within the scope of the BEGIN-END. In
the example of Figure 17, each processor
executing the DO PARALLEL has the
same value of S and a distinct value of T.
In addition, T does not exist outside the
BEGIN-END.


      REAL S
      do parallel ( I = 1, N )
      begin
        REAL T
      end
      end parallel


Figure 17. BEGIN-END code.


In the Protran version of the factorization
code shown in Figure 18, routine
SETRNG is used to set up a dope vector
containing the lengths of the arrays being
used. These lengths need not be the same
as the dimensions of the arrays. This code
also shows several of the array extensions
proposed for Fortran 8X.

Note the use of a nested DO PARALLEL.
It is left to the compiler to decide
what work can be done in parallel. For
example, the inner DO PARALLEL cannot
be started until the column scaling is
complete. Depending on the hardware, the
scaling could be done on one processor or
distributed over processors. In this case, we
let the compiler contain the knowledge of
the hardware instead of the programmer.
Such a procedure puts more pressure on
compiler writers, but leads to more portable
code.


      PARAMETER (ND=100)
      DIMENSION A(ND,ND), TEMP(ND)
      READ (5,*) N
      CALL SETRNG(A,N,N)
      CALL SETRNG(TEMP,N)
      DO 20 IC = 1, N-1
        CMAX = MAX(A(IC:N,IC))
        IMAX = LOCMAX(A(IC:N,IC))
        TEMP = A(IMAX,:)
        A(IMAX,:) = A(IC,:)
        A(IC,:) = TEMP
        do parallel (JR=IC+1,N)
        begin
          A(IC,JR) = A(IC,JR)/CMAX
          do parallel (K = IC+1:N)
            A(JR,K) = A(JR,K) - A(IC,K)*A(JR,IC)
          end parallel
        end
        end parallel
   20 CONTINUE
      END


Figure 18. Protran code.


Hypercube. The code presented in Figure 19
is part of the Linpack library being
written for the iPSC, 21 a hypercube message
passing system marketed by Intel Corporation.
It is quite similar to the program
shown in Figure 13b as subroutine FACTOR.
Although the distribution of the
data is not shown, a scheme similar to that
shown in Figure 13a could be used.


      SUBROUTINE PGEFA(A,LDA,N,M,IP,CID,ID,IPVT,BUF)
      DIMENSION A(LDA,M), BUF(N), IPVT(M)
      L = 1
      DO 20 K = 1, N
        IR = MOD(K-1,IP)
        IF ( IR .EQ. ID ) THEN
          KP = ISAMAX ( N-K+1, A(K,L), 1 ) + K + 1
          IF ( A(KP,L) .EQ. 0.D0 ) KP = 0
          IPVT(L) = KP
          IF ( KP .NE. 0 ) THEN
            T = -1.0/A(K,L)
            CALL SSCAL ( N-K, T, A(K+1,L), 1 )
          ENDIF
          CALL SCOPY (N-K, A(K+1,L), 1, BUF, 1)
          BUF(N-K+1) = KP
          L = L + 1
        ENDIF
        CALL GSEND(ID,IR,K,IP,CID,BUF,N-K+1)
        KP = BUF(N-K+1)
        IF ( KP .NE. 0 ) THEN
          CALL SSWAP ( M-L+1, A(IPVT(K),L), N, A(K,L), N )
          DO 10 J = L, M
            CALL SAXPY(N-K,A(K,L),BUF,1,A(K+1,J),1)
   10     CONTINUE
        ENDIF
   20 CONTINUE
      END


Figure 19. iPSC code.

Each node of the hypercube runs the 
same subroutine, so this code is an exam¬ 
ple of SPMD programming. In this system, 
both the number of processors, IP, and the 
process id, ID, are passed through the argu¬ 
ment list. The GSEND routine is similar to 
the SEND(*,... construct used in Figure 
13b. GSEND also contains code to receive 
the data. Thus, all data routing and syn¬ 
chronization is handled in this routine. If 
the program were moved to a different mes¬ 
sage passing system, all that need be 
changed is GSEND. 


The point of parallel processing is to
reduce the elapsed time to complete
the job. If the program that
sums 1,000,000 numbers were run on a
1-Mflop uniprocessor, it would finish in 
about one second. The time the program 
takes on a parallel processor will depend on 
the coding style, the architecture of the 
machine, and the hardware implementa¬ 
tion. However, run on 10 processors each 
capable of 1 Mflop, the job will certainly 
take longer than 0.1 seconds to complete. 
The job of the system designers, compiler 
and library writers, and application 
programmers is to get the actual time for the 
job as close as possible to the ideal. 
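
The claim that 10 processors cannot reach 0.1 seconds can be made concrete with an optimistic lower bound. The sketch below assumes that each processor first sums its own share of the numbers and that the partial sums are then combined pairwise in a tree; it charges nothing for communication or synchronization, so real machines will do worse. The function names are illustrative.

```python
import math


def uniprocessor_time(n_values, flops_per_second):
    """Time for n_values - 1 additions on one processor."""
    return (n_values - 1) / flops_per_second


def parallel_lower_bound(n_values, nprocs, flops_per_second):
    """Optimistic time: local partial sums, then a tree combination of them."""
    local_adds = math.ceil(n_values / nprocs) - 1
    combine_steps = math.ceil(math.log2(nprocs))
    return (local_adds + combine_steps) / flops_per_second


if __name__ == "__main__":
    print(uniprocessor_time(1_000_000, 1e6))
    print(parallel_lower_bound(1_000_000, 10, 1e6))
```

Even in this idealized model the combining steps push the elapsed time just past 0.1 seconds, and every real cost left out of the model widens the gap.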

I have identified three classes of parallel 
architectures: shared memory, message 
passing, and hybrid. I have also illustrated 
two programming styles: fork-join and 
SPMD. Each programming style can be 
used on each of the parallel architectures 
depending on the tools provided to the 
applications programmer. 

Algorithms are easy to design for shared 
memory systems. One simply puts the data
in memory as if running on a uniprocessor. 
On the other hand, programs are hard to 
debug. An error usually involves picking up 
wrong data from a global variable. The 
processor then continues computing, with 
the bad data producing an erroneous final 
result. There is no indication of when the
error occurred.

Message passing systems are different. 
Algorithm design is hard because the data 
must be distributed so that communications 
traffic is minimized. Debugging is easier 
than in shared memory systems because 
errors normally cause the system to stop at 
the point of the error. Thus, the program¬ 
mer knows the complete state of the 
machine at the point where the error 
occurred. This is not to say that debugging 
is easy. There are many times where it is very 
hard to track down the cause of the error, 
but at least the programmer knows where 
to start looking. 

Hybrid systems are the worst of both 
worlds. Errors are hard to find because they 
are the same ones made on shared memory 
systems. In addition, care is needed in 
organizing the data in order to reduce the 
amount of data to be moved. All is not lost, 
however, since hybrid systems may be eas¬ 
ier to build than either shared memory or 
message passing systems. 

From these comments we see that data 
organization is the key to parallel algorithms 
even on shared memory systems. Unfor¬ 
tunately, Fortran programmers have such 
simple data structures available to them, 
just scalars and arrays, that we tend to concentrate
on program flow. It will take some
retraining to get Fortran programmers to 
plan their data first and their program flow 
later. 

The importance of data management is 
also a problem for people writing automatic 
parallelization compilers. To date, our com¬ 
piler technology has been directed toward 
optimizing control flow. Such features as 
common expression elimination, code 
movement, and dependence analysis for 
vectorization have been used for many 
years. Even today, when hierarchical mem¬ 
ories make program performance a func¬ 
tion of data organization, no compiler in 
existence changes the data addresses speci¬ 
fied by the programmer to improve per¬ 
formance. If such compilers are to be 
successful, particularly on message passing 
and hybrid systems, a new kind of analysis 
will have to be developed. This analysis will 
have to match the data structures to the 
executable code in order to minimize mem¬ 
ory traffic. 

A more fundamental problem on shared 
memory systems arises when the parallelism 
is nested. Most parallel processors allow 
only a single level of parallelism, which 
greatly simplifies the data sharing specifi¬ 
cation. In these systems, a master process is 
allowed to spawn subprocesses, but the sub¬ 
processes may not themselves spawn 
processes. Alternatively, all processors run 
the same code as their peers. In both cases, 
data is either known to all processes or is 
private. 

The situation is more complicated when 
a subprocess is allowed to spawn sub¬ 
processes of its own. Consider a parallel 
job currently running two subprocesses 
that each call a library routine to work on 
some private data. If the library routine 
spawns subprocesses of its own, a simple 
global/local dichotomy for the data will 
not suffice. Each of the library routine 
invocations must share a different set of 
data among its subprocesses. Clearly, some 
form of scoping of the global data is 
required. The data scoping rules in such 
block-structured languages as Algol and 
PL/I provide a useful model, but to date 
no systems have addressed this issue. 

As I stated in the introduction, my goal 
was to present the state of the art of paral¬ 
lel programming. I believe I have shown 
what a sorry state that art is in. We are just 
beginning to define the appropriate set of 
language extensions. Much more work is 
needed in compiler-assisted dependence 
analysis and in developing debugging 
tools.


System Architecture of a
Gallium Arsenide 
One-Gigahertz Digital IC Tester 


Douglas J. Fouts, John M. Johnson, Steven E. Butner, and Stephen I. Long 
University of California, Santa Barbara 


A team from UCSB 
has an approach for 
testing full-custom 
GaAs ICs. The tester 
they’ve developed is a 
hybrid of GaAs digital 
ICs, high-speed silicon 
logic, and a standard 
microprocessor. 


The technology to support full-custom
GaAs integrated circuits
is now emerging. It is still an 
evolving technology with few standards, 
relatively low density, and poor yield. The 
promise of GaAs is its speed; the challenge 
is in harnessing that speed. 


Why a tester? 

This article describes the architecture of 
an early GaAs system, a 1-GHz digital 
integrated circuit tester. The motivation 
for this project was the development of a 
graduate-level course in GaAs integrated 
circuit design at the University of Califor¬ 
nia, Santa Barbara (refer to the sidebar 
description of the course on page 60). 
UCSB’s course produces a multiproject 
GaAs chip every year (Figure 1), and the 
tester is a tool for evaluation of the 
projects after dicing and packaging. The 
multiproject chips are also the source of 
custom GaAs ICs that will be used to build 
the high-speed portions of the IC tester. 

The tester serves a twofold purpose. 
First, very high speed test equipment is 
needed for exercising and evaluating custom
GaAs chips and subsystems. Since the
speed of these chips and subsystems far
exceeds the capabilities of existing digital 
test equipment, there is little choice but to 
make the required tool. Secondly, the tester 
serves as a focus for the first round of 
designs. It is during this round that stan¬ 
dards are set: pads and pad drivers are 
created, voltage swings are established, 
interconnect impedances and packaging 
standards are set, and clocking techniques 
formulated. The first project undertaken
(in this case a "bootstrap" project) sets the
design style for more complex systems yet 
to come. Because of the complexity of 
designing GaAs systems, the design team 
chose not to build a computer as the first 
project. Rather, the high-speed functional 
tester project was selected. Once the tester 
project is accomplished, the team has plans 
to apply what has been learned toward the 
design of a very high speed GaAs computer 
system. 

The goals of the UCSB tester project are 
as follows: 

(1) The tester should be able to test chips 
with a clock rate of up to 1 GHz. 

(2) It should be versatile enough to test 
GaAs unbuffered FET logic, buffered FET 
logic, Schottky diode FET logic, direct- 
coupled FET logic, as well as silicon 
emitter-coupled logic.

(3) The system should be easy to reconfigure
in order to test a chip with a different
pin-out, or to run a different test on the
same chip.

(4) It must be possible to specify an 
arbitrary set of test vectors (subject to a 
reasonable size limitation) for use as a set 
of stimuli to the unit under test. The sys¬ 
tem must be able to apply digital signals 
from that set in either a single-shot or con¬ 
tinuous (looping) mode at full test speed. 

(5) The tester should be able to capture 
and store the response of the unit under test 
on a per-pin basis at full test speed. 

(6) The speed of application of the test 
should be adjustable with high resolution 
over a wide range, with particular empha¬ 
sis on the region near the highest speed. 

Some test methods. There are several 
methods that can be used for testing digi¬ 
tal integrated circuits. As the density avail¬ 
able in conventional MOS integrated 
systems has risen, so has the trend toward 
using built-in circuitry to perform self-test. 
This approach minimizes the need for spe¬ 
cial test gear—the chip or subsystem sim¬ 
ply tests itself. Such an approach generally 
incurs a reasonably high overhead that, 
given the present density and yield of custom
GaAs chips, is not justified. In special
cases, however, where the amount of additional
circuitry is small, the built-in test
approach is the best approach.

Falling short of complete built-in self¬ 
test, but still better than zero designed-in 
testability, is the technique of using generic 
structures (usually programmable logic 
arrays, or PLAs) that can be tested with¬ 
out regard for the function they realize. 
Gallium arsenide densities and yields are 
not yet ready to support large PLAs, and 
thus such an approach cannot currently be 
used. 

Since built-in test circuits often are not 
practical, special-purpose external test 
equipment is required. Parametric 
testing—the measurement of ac and dc 
parameters of the circuits within the 
chip—can be done for the most part at low 
speeds and with existing general-purpose 
laboratory equipment. Digital testing at 
device speeds (which are running well into 
the gigahertz range), however, is impossi¬ 
ble with any known test equipment. Thus 
the choice of a functional high-speed, 
general-purpose digital test stand as a pro¬ 
ject not only serves to focus our initial 
design effort, but also fills a void in the 
spectrum of test equipment available for 
general use. 

A number of lower-speed functional
testers are currently available. 1-4 High-speed
testers, such as the one described in
this article, are very similar to their lower- 
speed counterparts mentioned above. 
High-speed testers, however, are exposed to 
unique problems that must be solved at the 
system or architectural level. Among these 
problems are 

• interconnection of signals with short 
rise and fall times, 

• skew and distribution of high-speed 
control signals and clocks, 

• design of custom high-speed com¬ 
ponents, 

• selection and screening of high-speed 
components, 

• physical placement of high-speed 
parts, 

• distribution and bypassing of power, 

• ground distribution, and 

• dissipation of heat. 

Test assumptions and conditions. The
tester we have designed is intended to fill
a need at the extreme speed range of the 
testing spectrum. Not all chip technologies 
will be testable. The types of chips targeted 
for support in our prototype test stand are 
those that are likely to operate at or above 
200 MHz, and those having ECL-compatible
external logic levels and noise
margins. These assumptions allow most
GaAs logic families as well as silicon ECL 
to be tested. It will be necessary to package 
all chips before testing them at high speed. 
Probe-type testing can be accomplished 
before dicing. We use microwave probe 
cards 5 to weed out nonfunctional chips 
and to get partial performance evaluation 
data before packaging. 

Standardization of logic levels for GaAs 
logic families has not yet occurred. To 
maintain compatibility with existing high¬ 
speed silicon logic, F100K series ECL logic 
levels and noise margins have been chosen 
as standard. 6 

• VOH(min) = -1.025 volts

• VOH(max) = -0.88 volts

• VOL(min) = -1.810 volts

• VOL(max) = -1.620 volts

• VIH(min) = -1.165 volts

• VIH(max) = -0.88 volts

• VIL(min) = -1.810 volts

• VIL(max) = -1.475 volts
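
The noise margins implied by these levels follow from the usual definitions: the high margin is the minimum output-high level minus the minimum input-high level, and the low margin is the maximum input-low level minus the maximum output-low level. The sketch below applies those definitions to the values listed above; the dictionary keys are illustrative names, and the reading of the listed symbols as output-high, output-low, input-high, and input-low levels is an assumption.

```python
# Logic levels in volts, taken from the list above.
LEVELS = {
    "VOH_min": -1.025, "VOH_max": -0.880,
    "VOL_min": -1.810, "VOL_max": -1.620,
    "VIH_min": -1.165, "VIH_max": -0.880,
    "VIL_min": -1.810, "VIL_max": -1.475,
}


def noise_margins(levels):
    """Return (high, low) noise margins in volts."""
    high = levels["VOH_min"] - levels["VIH_min"]
    low = levels["VIL_max"] - levels["VOL_max"]
    return high, low


if __name__ == "__main__":
    high, low = noise_margins(LEVELS)
    print(f"high margin {high:.3f} V, low margin {low:.3f} V")
```

Both margins come out at roughly 0.14 V, which is small compared with the voltage excursions that fast edges can induce on interconnections.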


All our custom GaAs unbuffered FET 
logic, buffered FET logic, and Schottky 
diode FET logic circuits are equipped with 
pad drivers and pad receivers that convert 
to these external specifications. Even if a 
chip does not have compatible external 
logic levels, it can still be tested by the tester 
if its external logic swing is between 0.2 and 
1.0 V. This is possible because the power for 
the chip under test comes from different 
supplies than does the power for the tester. 
Testing a chip with incompatible logic
levels is accomplished by skewing the
power supply voltages to the chip under
test such that the logic swing of the chip
under test crosses the logic threshold of the
tester.


The GaAs IC design course at UC-Santa Barbara


Over the past five years, silicon
VLSI design courses have become
quite prevalent at many universities.
These courses present a structured
design approach that, when used
together with CAD layout and simulation
tools, results in project circuits
that are in many cases fabricated by
the DARPA-funded silicon VLSI project
("MOSIS"), and later returned to
the university to be tested by the
students.

A GaAs IC design course is being
taught at UCSB that builds on this,
by now, established format. The
objective of this course is to present
an introduction to GaAs devices and
their application in digital ICs sufficient
for the student to complete a
design project of SSI/MSI complexity
(up to about 200 gates). Because of
the emphasis on high-speed circuits,
the UCSB course tends to stress
understanding of the GaAs devices
and their interactions with interconnections,
load capacitance, transmission
lines, and packages to a
greater degree than is usual for a
VLSI structured design course. Also,
because of the logic voltage swing
limitations, more emphasis is placed
on gate level analysis of noise margins
and speed. A modified version
of Spice2 equipped with a GaAs
MESFET model is used in homework
assignments and for the design project.
Recently, Spice3 has been used
to support this modeling work. Other
CAD tools that have been adapted for
GaAs include Caesar (a tool for creating
and editing IC mask structures),
Lyra (a design rule checking program
compatible with Caesar), and, more
recently, Magic (a technology-independent
VLSI CAD system). Magic
supports both node extraction and
design rule checking. All the CAD
programs (originally from UC-Berkeley)
were modified to be compatible
with the GaAs design rules
and device characteristics.

Fabrication of the class projects is
being made possible by DARPA contracts
that have been awarded to
establish GaAs foundries at Rockwell
and Honeywell. Funding was
provided to UCSB by DARPA through
NASA and JPL to support fabrication
of the circuits in the foundry, to
develop documentation for GaAs IC
design and fabrication through foundries,
and to support the tester project
described here. The composition
of the multiproject chips has been
more constrained than is typical for
silicon VLSI multiproject chips
because of the direct-step-on-wafer
projection lithography required for
the 1-micron GaAs process. Rather
than populating the entire wafer with
different projects, an 8-mm field
must contain all chips, which then
are repeated 44 times on the 3-inch
GaAs wafer. Therefore, a total of 12
projects were included on the first
mask set. In addition to the UCSB circuits
on the first set of project chips,
circuits designed by JPL and Kent
Smith of the University of Utah have
also been included.

Because there is little or no theoretical 
work available on fault models for GaAs 
subsystems, we have chosen to support 
functional testing. This decision is based 
on maximizing flexibility and test speed so 
that virtually any fault model and 
associated testing philosophy can be sup¬ 
ported once theoretical results for GaAs do 
become available. Functional testing, sim¬ 
ply speaking, is exercising the circuit under 
test in a manner similar to its normal 
intended operation. Functional testing 
stands apart from other approaches, which 
generally exercise the circuit under test only 
so as to cause detection of possible faults 
from a known class, for example, the sin¬ 
gle or multiple line stuck-at faults. 7 This 
type of testing—which we shall call fault- 
model-based testing—is usually not per¬ 
formed at high speeds, except to minimize 
overall test fixture occupancy. It is also the 
case that the test vectors often do not rep¬ 
resent a “real” environment such as would 
be seen by the chip under test when it is 
placed in the application system. 

The hardware required to support func¬ 
tional testing is sufficient to support fault- 
model-based testing as well. The capabil¬ 
ity to stimulate any pin of the circuit under 
test with an arbitrary predefined pattern at 
full speed must exist. All such input pins 
(from the point of view of the circuit under 
test) must be synchronized at the test fix¬ 
ture. In addition, the test stand must have 
the capability to sample and store the out¬ 
put of any pin of the circuit under test. Not 
all pins in today’s digital systems are purely 
input or output. Bidirectional, tristate pins 
are typical. Because of the extreme speed 
capabilities and corresponding environ¬ 
mental requirements of GaAs chips and 
the significant performance penalty and 
packaging complexity incurred by using 
buses and tristate devices, the use of such 
devices in GaAs subsystems is not thought 
to be advantageous. The prototype test 
stand will not provide support for bidirec¬ 
tional pins of the circuit under test. 


Architectural problems 
unique to GaAs digital 
systems 

The architectures of most contemporary 
computers and digital systems are 
influenced more by function and application
than by implementation. If this
approach is taken with a GaAs system, the 
result will quite probably be a less-than- 
optimal design because GaAs digital sys¬ 
tems can suffer from several technology- 
related problems. These problems are only 
partially solvable during the implementa¬ 
tion phase of a project. Therefore, if a 
design is to be efficiently implemented in 
GaAs, these technology-related problems 
must be taken into consideration during 
the architectural design phase of the 
system. 

One of the biggest technology-related 
problems that affects the architectural 
design of a GaAs computer or digital sys¬ 
tem is caused by fast signal transitions 
propagating from chip to chip. Spice 
simulations 8,9 of the GaAs digital ICs
designed for this tester indicate that the 
chips will generate output signals with rise 
and fall times in the vicinity of 75 to 150 
ps. These subnanosecond square waves 
will have components that are many mul¬ 
tiples of the fundamental frequency. With 
such high-speed signals, interconnections 
as short as a few millimeters will behave 
like transmission lines. 10 It is therefore 
necessary to use impedance-controlled 
transmission line structures for all high¬ 
speed interconnect lines. 
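
The claim that a few millimeters of interconnect already behaves like a transmission line can be checked with a rough calculation. The sketch below uses two common rules of thumb that are not taken from this article: an edge with rise time t_r has significant spectral content up to about 0.35/t_r, and a line should be treated as a transmission line once its round-trip delay exceeds the rise time. The dielectric constant of 4.5 is an assumed value for a glass-epoxy board.

```python
import math

C_MM_PER_PS = 0.299792458  # speed of light in vacuum, mm per ps


def knee_frequency_ghz(rise_time_ps):
    """Rule-of-thumb bandwidth of an edge, 0.35 / t_r (assumption, not from the article)."""
    return 0.35 / rise_time_ps * 1000.0  # 1/ps equals 1000 GHz


def critical_length_mm(rise_time_ps, eps_r=4.5):
    """Line length at which the round-trip delay equals the rise time."""
    delay_ps_per_mm = math.sqrt(eps_r) / C_MM_PER_PS
    return rise_time_ps / (2.0 * delay_ps_per_mm)


for t_r in (75.0, 150.0):
    print(f"t_r = {t_r:5.1f} ps: bandwidth ~ {knee_frequency_ghz(t_r):.1f} GHz, "
          f"critical length ~ {critical_length_mm(t_r):.1f} mm")
```

Under these assumptions the critical length for a 75-ps edge is on the order of 5 mm, which is consistent with the statement above.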

The two most practical types of trans¬ 
mission line structures for interconnecting 
devices on the same printed circuit board 
(PCB) are microstrip (Figure 2) and 
stripline (Figure 3). The impedance (Z0)
of these types of transmission lines is deter¬
mined by the width (W) and thickness (T)
of the traces, and by the thickness (H or B)
and dielectric constant (εr) of the board
material. Equations for calculating Z0 are
readily available. 11 
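
As an illustration of how these parameters interact, the sketch below evaluates two widely used closed-form approximations for Z0. These are common handbook formulas, not necessarily the equations given in reference 11, and they are reasonable only for moderate width-to-height ratios. The dimensions are illustrative values in mils and are not the tester's actual board stack-up.

```python
import math


def microstrip_z0(w, t, h, eps_r):
    """Approximate Z0 (ohms) of surface microstrip; w, t, h in the same units."""
    return 87.0 / math.sqrt(eps_r + 1.41) * math.log(5.98 * h / (0.8 * w + t))


def stripline_z0(w, t, b, eps_r):
    """Approximate Z0 (ohms) of symmetric stripline; b is the plane-to-plane spacing."""
    return 60.0 / math.sqrt(eps_r) * math.log(4.0 * b / (0.67 * math.pi * (0.8 * w + t)))


# Assumed example geometry: 1.4-mil copper, eps_r = 4.5, H = 10 mils, B = 24 mils.
for width in (4.0, 8.0):
    z_micro = microstrip_z0(width, 1.4, 10.0, 4.5)
    z_strip = stripline_z0(width, 1.4, 24.0, 4.5)
    print(f"W = {width} mils: microstrip ~ {z_micro:.0f} ohms, stripline ~ {z_strip:.0f} ohms")
```

In both structures the wider trace gives the lower impedance, which is why line width is the practical control on Z0 once the board material and layer spacing are fixed.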

The dielectric constant of a board is, of 
course, dependent upon the type of board 
material. The thickness of the conductors 
and the distance between the conducting 
planes of the board are usually determined 
by the number of planes, the overall thick¬ 
ness of the board, and processing toler¬ 
ances. Thus the parameters of conductor 
thickness and ground plane to signal trace 
distance are usually determined by practi¬ 
cal and processing considerations. Con¬ 
ductor line width is therefore the only 
parameter that can be varied at will to con¬ 
trol the characteristic impedance of an 
interconnection. 

For TTL or MOS circuits the intercon¬ 
nection traces can be made as narrow as 
processing limitations allow, thus provid¬ 
ing more room for interconnect. However, 
the width of traces that need to be
impedance controlled cannot be made
arbitrarily small just for the sake of 
increased interconnection density. For 
high-speed interconnect, traces must 
usually be widened to lower their charac¬ 
teristic impedance to the normally used 
50-ohm standard. The wider the traces for 
a given size board, the fewer traces the 
board can hold. Thus the interconnection 
density of a PCB designed for high-speed 
logic may be significantly less than the 
interconnection density of a similar board 
designed for lower-speed logic. The 
stripline used on the PCBs for the 
described tester is twice as wide as it would 
be if it were designed to interconnect TTL 
or MOS chips. Because of these wide 
traces, the tester PCBs can support only 
about 50 percent of the interconnect den¬ 
sity that would be possible using minimum 
line widths. 

The interconnection problem is even 
worse when high-speed signals need to be 
sent from one PCB to another. This must 
be done with high-frequency coaxial cables 
and connectors. The described tester uses 
low-loss coaxial cables and SMA connec¬ 
tors rated at 12 GHz for circuits that are 
clocked at slightly above 1 GHz. For higher 
frequency circuits, correspondingly higher 
frequency cables and connectors must be 
used. The problem with these high- 
frequency components is that they utilize 
additional space and are difficult to work 
with when compared to printed circuit 
backplanes, wire wrap, or ribbon cable. 

The best solution to the low-density 
problem of high-speed interconnect is to 
design an architecture that minimizes the 
amount of high-speed interconnect. It may 
not be possible to eliminate all the high¬ 
speed clocks and control signals, but struc¬ 
tures such as high-speed buses should be 
avoided. The described tester’s architecture 
is such that most of the PCBs require only 
two high-speed signal interfaces from/to 
each board. 

Another technology-related problem 
awaiting the GaAs computer is the low 
density (compared to MOS, TTL, and 
ECL) of GaAs ICs. Current technology is 
such that GaAs LSI logic is just now 
becoming practical. VLSI is still years off. 
This problem works against a GaAs digi¬ 
tal system because more chips are required 
to implement a specified function, and 
more chips means more interconnect, 
which is undesirable. 

One obvious possibility is to design large 
circuit boards that can hold many ICs and 
still have room for a sufficient amount of 
microstrip or stripline interconnect.
Experience has shown that large PCBs with
many chips and interconnects usually have
at least some long interconnect lines,
extending perhaps as long as 12 inches.
This approach introduces three new
problems.

Figure 2. Microstrip transmission line.

Figure 3. Stripline transmission line.

Problem one is that of signal attenua¬ 
tion. Most PCBs are made from substances 
such as glass-epoxy or polyimide. These 
materials have a fairly high attenuation 
(loss) factor when compared to air. The 
attenuation is usually negligible at low 
(TTL and MOS) speeds or over short dis¬ 
tances of a few inches. However, when dis¬ 
tances are increased to as much as 12 
inches, signal attenuation at high frequen¬ 
cies causes pulse dispersion and begins to 
degrade noise margins. Equations for cal¬ 
culating signal loss in microstrip and 
stripline as a function of frequency of 
operation (due to skin effect and dielectric 
losses) can be found in Gupta et al. 11 

Another problem introduced by long 
PCB interconnect is related to signal 
propagation delay. Typical propagation 
delay through PCB transmission line struc¬ 
tures is usually greater than 1 ns per foot. 
For a system running with a 1-GHz clock, 
such a signal would take more than one 
clock period just to travel across a 12-inch 
board. Similar problems occur for signals
that travel between boards over coaxial
cables. 
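
The delay figure can be reproduced from first principles. The following sketch assumes a TEM line fully embedded in the dielectric, so that the delay per unit length is sqrt(εr)/c; the value εr = 4.5 is an assumed figure for glass-epoxy, not a number given in this article.

```python
import math

C_IN_PER_NS = 299.792458 / 25.4  # speed of light in vacuum, inches per ns


def delay_ns(length_in, eps_r):
    """Propagation delay of a TEM line of the given length embedded in dielectric eps_r."""
    return length_in * math.sqrt(eps_r) / C_IN_PER_NS


def clock_periods(length_in, eps_r, clock_ghz):
    """Delay expressed as a number of clock periods."""
    return delay_ns(length_in, eps_r) * clock_ghz


print(f"12 in, air:         {delay_ns(12.0, 1.0):.2f} ns")
print(f"12 in, eps_r = 4.5: {delay_ns(12.0, 4.5):.2f} ns "
      f"({clock_periods(12.0, 4.5, 1.0):.2f} periods at 1 GHz)")
```

Even with air as the dielectric the delay slightly exceeds 1 ns per foot, so any real board material pushes a 12-inch run beyond one period of a 1-GHz clock.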

The third problem introduced by long 
PCB interconnect is crosstalk. When sig¬ 
nal lines must be routed in parallel for 
several inches, crosstalk (undesired coup¬ 
ling between lines) must be considered. 
Crosstalk is proportional to the length of 
interconnection lines and signal speed, as 
well as other parameters. It is therefore 
desirable to minimize the length of inter¬ 
connections for high-speed logic. Crosstalk 
is inversely proportional to the distance 
between two traces. In some cases it may 
be necessary to space signal traces farther 
apart than the minimum distance allowed 
by processing tolerances. This will further 
reduce interconnection density. However, 
in the described tester, signal traces are 
allowed to be as close to each other as 
processing limitations will allow (0.008 
inch). It should be noted that the tester
uses stripline-type transmission line
because microstrip-type transmission line 
structures have more crosstalk than 
stripline-type structures. This is because 
the wave propagation in microstrip is not 
purely transverse electromagnetic (TEM). 
Therefore, different propagation veloci¬ 
ties are obtained for each transmission 
mode. Stripline, because of its symmetry, 
is TEM and forward crosstalk components 
are precisely canceled. 

Another factor that affects crosstalk is 
signal reflections from impedance discon¬ 
tinuities. The load and source of a trans¬ 
mission line structure must be impedance 
matched to the line. Research at UCSB has 
shown that reflections can also be caused 
by receiving chips, right angle corners in 
traces, and vias (plated-through holes that 
are used to change layers). Receiver circuits 
should have high impedance inputs. The 
custom GaAs ICs for the tester use source
follower input pad receivers with an
extremely high input impedance. 

Printed circuit board vias that are used 
to interconnect different board layers are 
not impedance-controlled structures and 
can cause severe reflections. The use of vias 
should be kept to a minimum. It should be 
noted that thinner PCBs have shorter vias 
and thus less severe reflections. Right angle 
corners in signal traces are not as bad as 
vias, but they can still cause problems. 
Trace corners should be rounded if CAD 
tools and processing allow it. If not, use of 
corners in traces should be minimized. 

The subject of crosstalk in digital sys¬ 
tems and coupling in transmission line 
structures is too vast for further treatment 
here. Readers interested in more informa¬ 
tion on this subject should consult refer¬ 
ences 10-12, or other texts on pulse and 
digital techniques and transmission line 
theory.

One comprehensive solution to all the
above-mentioned problems is to design an 
architecture that is modular. Modules 
should be designed such that all the high¬ 
speed components that need to communi¬ 
cate with each other are in the same mod¬ 
ule. The use of this technique will minimize 
the high-speed interconnect between mod¬ 
ules. When the design is implemented, each 
module can be packaged as a hybrid semi¬ 
conductor module similar to that shown in 
Figure 4. Using hybrids will minimize the 
length and maximize the density of the 
high-speed interconnect between the com¬ 
ponents. The hybrid semiconductor mod¬ 
ules contain all the chips, the high-speed 
interconnect, and termination for the inter¬ 
connect. The hybrids also contain provi¬ 
sions for distributing power and ground, 
bypassing the power supplies, and dissipat¬ 
ing heat. The described tester contains 25 
hybrid semiconductor modules, all 
identical. 

Even with a carefully designed architec¬ 
ture and hybrid semiconductor modules, it 
may still be necessary to have some high¬ 
speed signals travel distances of 12 inches 
or more. Such is the case for the clock and 
control signals of the tester. This problem 
can be overcome by using asynchronous 
communication techniques between high¬ 
speed modules. The choice of using care¬ 
fully engineered synchronous communica¬ 
tions was made for the tester project, 
however. 

If synchronous communication tech¬ 
niques are used, then special steps are 
required to maintain synchronization. In 
the described tester, all synchronized sig¬ 
nals originate from the same board, which 
is centrally located. The cables from the 
sending board to the receiving boards are 
all the same length, the length of the long¬ 
est required cable. Loading of the high¬ 
speed signals is exactly the same for every 
cable. 

In addition to these techniques, a fan¬ 
out and phasing chip has been developed. 
When a synchronous signal is received on 
a board or hybrid module, it is routed 
through the fan-out and phasing chip 
shown in Figure 5. The chip can be pro¬ 
grammed by the tester’s controlling 
microprocessor to select an experimentally 
determined amount of delay such that the 
received signal is kept in phase with the rest 
of the signals in the system. 
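
The phasing step can be pictured as choosing, for each receiving board, the delay setting that lines its signal up with the latest arrival in the system. The sketch below is only an illustration of that idea: the tap step, the number of taps, the board names, and the measured arrival times are all hypothetical, and the article does not describe the chip's actual programming interface.

```python
def select_delay_taps(arrival_ps, step_ps, num_taps):
    """Return one tap index per board so every signal lines up with the latest arrival.

    arrival_ps: measured arrival time of the signal at each board (hypothetical data).
    step_ps:    delay added per tap (assumed value).
    num_taps:   number of selectable delay settings (assumed value).
    """
    target = max(arrival_ps.values())
    taps = {}
    for board, arrival in arrival_ps.items():
        tap = round((target - arrival) / step_ps)
        if tap >= num_taps:
            raise ValueError(f"{board}: skew exceeds the available delay range")
        taps[board] = tap
    return taps


measured = {"board_a": 120.0, "board_b": 96.0, "board_c": 140.0}
print(select_delay_taps(measured, step_ps=10.0, num_taps=8))
```

In practice the settings are found experimentally, as stated above; the calculation only shows what the selected delay is meant to achieve.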

The last problem discussed here that 
affects GaAs system architecture is the 
problem of cost and availability of GaAs 
digital ICs. It would be nice to think that 
this problem does not exist, and perhaps it
does not if one is only interested in an aca¬
demic exercise. However, the construction 
of a system such as the described tester 
would not be possible unless the architec¬ 
ture was designed around parts that can be 
obtained, and obtained within the appro¬ 
priate budgetary constraints. There are 
very few “standard” GaAs parts available, 
and most of these are of very small scale 
integration. For this reason, the tester 
architecture is a hybrid design that uses the 
minimum possible number of high-speed 
GaAs parts. In order to minimize the num¬ 
ber of GaAs chips, custom chips were 
designed specifically for the tester. These 
parts are utilized in the part of the archi¬ 
tecture that is the limiting factor for the 
tester’s speed. The rest of the system is 
designed from off-the-shelf high-speed sili¬ 
con ECL, TTL, and MOS logic. 

Tester system 
architecture 

The tester is designed around a conven¬ 
tional microprocessor system. The overall 
system architecture of the tester can be 
seen in the block diagram of Figure 6. The 
microprocessor acts as an intelligent con¬ 
troller for the tester. The main memory is 
used to store programs for the 
microprocessor, test vectors for the chip 
under test, and result vectors for analysis. 

The purpose of the high-speed interface 
modules is to provide a link between the
low-speed (8 MHz) microprocessor and
the chip under test, where signals are 
clocked at 1 GHz. When connected to 
input pins of a chip under test, the high¬ 
speed interface modules store up test vec¬ 
tors at low speed and then apply them to 
the chip under test at high speed. When 
connected to output pins, the high-speed 
interface modules store up result vectors 
at high speed and then send these vectors
on to the microprocessor at low speed. 

The clock and control module has two 
jobs. Job one is to distribute the high¬ 
speed clock, which is generated by an 
external programmable frequency gener¬ 
ator. Although it is not shown in the block 
diagram, the frequency generator is 
attached to the microprocessor through an 
I/O port so that the microprocessor can 
control the clock frequency. Job two of the 
clock and control board is to synchronize, 
generate, and distribute all the required 
high-speed control signals. 

The main purpose of the test head is to 
provide a method for connecting the chip 
under test to the high-speed interface mod¬ 
ules without having to solder the chip 
down. The test head provides impedance- 
matched 50-ohm connections to all 24 I/O 
pins, and also provides the chip under test 
with power, ground, terminating voltage, 
and a method for dissipating heat. 

Not shown on the block diagram of Fig¬ 
ure 6 are the power supplies. The power 
supplies for the chip under test are com¬
pletely separate from the supplies that
power the tester, and are connected to the
microprocessor through an I/O port. This
allows the microprocessor to control the
power supply voltages to the chip under
test. This makes the tester more versatile
because it can test any chip, regardless of
the chip’s power requirements, so long as
the chip has appropriate external logic
levels and noise margins.

Figure 6. Overall system architecture of the tester.

Figure 7. Memory-multiplexer architecture.

The overall system design of Figure 6 
has many advantages and few disadvan¬ 
tages. Some of the more important advan¬ 
tages of this system design are the 
following: 

(1) All the high-speed parts of the tester 
can be controlled by the microprocessor, 
and because the microprocessor can be 
controlled by the host system, the tester is
versatile, easy to use, and easy to 
reprogram. 

(2) There is a minimum number of 
modules running at 1 GHz and, with the 
exception of the clock and control board, 
these modules are all the same. 

(3) The tester can store a large amount 
of test vectors, result vectors, microproces¬ 
sor software, and other information in its 
main memory. 

(4) The tester can be easily expanded to 
test chips that have more than 24 I/O pins 
by adding more high-speed interface 
boards. 

(5) Expansion to a stand-alone test sys¬ 
tem is straightforward, by adding a hard 
disk and an operating system. 

The most crucial part of the tester, and 
the most difficult to design, is the high¬ 
speed interface module. Three competing 
architectures were evaluated before 
proceeding with the design of these mod¬ 
ules. Selection of an architecture was 
based entirely on its ability to perform the 
necessary tasks at the required speed (1 
GHz). The primary issues concern (1) the 
local storage of test and result vectors dur¬ 
ing the performance of a test, (2) the 
parallel-to-serial conversion of test vectors 
(in the input mode), and (3) the serial-to-
parallel conversion of test results (in the
output mode). Two of the architectures
(memory-multiplexer and memory-shift
register) provide for the storage of test and
result vectors in a high-speed (ECL)
RAM. The third architecture utilizes a 
long chain of GaAs shift registers for local 
data storage. 


Memory-multiplexer architecture. The
memory-multiplexer architecture (Figure
7) provides, for each pin of the chip under
test, local storage capabilities for a 4096-bit
test vector (or test result). Data is stored as
32-bit words in an ECL RAM. The custom 
GaAs components required in this archi¬ 
tecture are a multiplexer, a demultiplexer, 
and some latches (not shown in Figure 7). 
Parallel-to-serial conversion of input test 
data is accomplished via a 32:1 multiplexer, 
controlled internally by a 5-bit counter. 
Conversely, serial-to-parallel conversion of 
test results is provided by a 1:32 demul¬ 
tiplexer. 

The operation of the tester proceeds as 
follows. When configured for input (i.e., 
the pin under test is an input pin), data is 
loaded into the RAM in parallel from the
microprocessor bus. RAM addresses are 
generated by the 8-bit counter, under con¬ 
trol of the microprocessor. Loading of test 
data necessarily occurs at the speed of the 
MOS components and testing begins only 
after the entire test vector has been loaded 
into the RAM. 

On initiation of the test, RAM address¬ 
ing is again accomplished via the 8-bit 
counter, now running at the full clock 
speed of the test divided by 32. Thirty-two- 
bit words are applied to the multiplexer for 
parallel-to-serial conversion while, simul¬ 
taneously, the next 32-bit word is accessed 
from the RAM. In the output mode, the 
demultiplexer/counter provides for the 
serial-to-parallel conversion of test results 
in an analogous manner to that described 
above. 
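
The data path of this architecture can be modeled in a few lines. The sketch below is a behavioral model only, not a description of the actual hardware: it packs a 4096-bit vector into 32-bit RAM words, serializes each word under control of a 5-bit select count, and reverses the process for result vectors. The bit order within a word (first bit in the least significant position) is an assumption.

```python
WORD_BITS = 32      # RAM word width (from the article)
VECTOR_BITS = 4096  # per-pin vector length (from the article)


def pack_words(bits):
    """Pack a list of 0/1 values into 32-bit words, first bit in the LSB (assumed order)."""
    assert len(bits) % WORD_BITS == 0
    words = []
    for i in range(0, len(bits), WORD_BITS):
        word = 0
        for j, bit in enumerate(bits[i:i + WORD_BITS]):
            word |= (bit & 1) << j
        words.append(word)
    return words


def serialize(words):
    """Model the 32:1 multiplexer: a 5-bit counter selects one bit per fast clock."""
    for word in words:                   # RAM address advances at clock/32
        for select in range(WORD_BITS):  # 5-bit counter counts 0..31
            yield (word >> select) & 1


def deserialize(stream):
    """Model the 1:32 demultiplexer: gather 32 serial bits into each RAM word."""
    return pack_words(list(stream))


vector = [(i * 7) % 3 == 0 for i in range(VECTOR_BITS)]
vector = [int(b) for b in vector]
words = pack_words(vector)
assert list(serialize(words)) == vector
print(len(words), "words of", WORD_BITS, "bits")
```

The model makes the timing constraint visible: the inner loop runs once per high-speed clock, while the outer loop, which stands for the RAM access, must complete within 32 of those clocks.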

A detailed analysis of the timing and 
synchronization requirements of this archi¬ 
tecture brought to light a number of diffi¬ 
culties in implementation (see Figure 7). 
The most serious of these is the need for
two high-speed clocks (“clock” and
“clock/32” in the figure) operating in
phase. At the full speed of the tester, cor¬
rect operation requires tight synchroniza¬
tion between the ECL components
operating at 31.25 MHz (in the figure, the 
8-bit counter and RAM), and the custom 
GaAs components at 1 GHz (the 5-bit 
counter, multiplexer, demultiplexer, and 
latches in the figure). 


Memory-shift register architecture. The
memory-shift register architecture (Figure
8) for the high-speed interface modules is 
very similar to the memory-mux architec¬ 
ture in that local data storage is provided 
via an ECL RAM. The principal difference 
is that the parallel-to-serial/serial-to- 
parallel conversions are carried out by a 
GaAs shift register (with parallel I/O capa¬ 
bilities) rather than the multiplexer/ 
demultiplexer combination. Operation is,
of course, similar to that of the memory-
mux architecture.

Figure 8. Memory-shift register architecture.

The advantage of this architecture over 
the memory-multiplexer architecture is the 
use of a single (and possibly simpler) cus¬ 
tom GaAs chip; all other components are 
off-the-shelf parts. However, the disadvan¬ 
tages associated with the complexity of 
control and synchronization at high speeds 
remain. 

The shift register architecture. The third 
potential architecture for the high-speed 
interface module utilizes a chain of GaAs 
shift registers for local data storage, remov¬ 
ing the control complexity imposed by the 
memory-based architectures. Figure 9 illus¬ 
trates this architecture. 

This approach maintains a number of 
advantages over the memory-based 
architectures and, as such, was chosen for 
implementation. Here, the speed of the
tester is limited by the speed of the shift reg¬
ister chain, rather than by the control cir¬
cuitry. Additionally, only a single
high-speed clock is required during test 
performance. 

Two custom GaAs chips are required in 
this implementation: a 32-bit serial I/O 
shift register and a 4-bit parallel/serial I/O 
shift register. The size of the serial I/O shift 
register is constrained by die acreage and 
the parallel I/O component by packaging 
limitations (a 36-pad frame was provided 
for all the multiproject GaAs chips). 

Operation of this tester is considerably 
simpler than the memory-based architec¬ 
tures discussed previously. In the input 
mode test vectors are loaded into the shift 
register chain, in parallel 4-bit words, from 
the microprocessor bus. The register is then 
shifted left 4 bits and another word is 
loaded. This cycle continues until the entire 
test vector is loaded into the shift chain. On
initiation of the test, the data are then
applied to the pin under test at the speed
of the programmable clock. Output pins
are tested in a complementary fashion,
with test results provided to the output shift
register chains.

Figure 9. Shift register architecture (figure label: 4-bit shift register with parallel I/O).

The current design allows for a test vec¬ 
tor length of 264 bits. Expansion of test 
vector length can be accomplished by 
adding more shift registers to the chain. 
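
The load-and-shift cycle of the shift register architecture can be sketched as a simple behavioral model. The placement of the 4-bit parallel register at one end of the chain and the order in which bits reach the pin are assumptions made for illustration; the article gives only the word width and the 264-bit vector length.

```python
CHAIN_BITS = 264  # test vector length of the current design (from the article)
LOAD_WIDTH = 4    # parallel load width (from the article)


def load_chain(vector):
    """Load a vector in 4-bit words, shifting the chain 4 places between loads."""
    assert len(vector) == CHAIN_BITS
    chain = [0] * CHAIN_BITS
    loads = 0
    for i in range(0, CHAIN_BITS, LOAD_WIDTH):
        # Shift left by 4 and place the new word at the right-hand end (assumed layout).
        chain = chain[LOAD_WIDTH:] + vector[i:i + LOAD_WIDTH]
        loads += 1
    return chain, loads


def shift_out(chain):
    """Apply one bit per high-speed clock to the pin, leftmost bit first."""
    chain = list(chain)
    while chain:
        yield chain.pop(0)


vector = [(i // 3) % 2 for i in range(CHAIN_BITS)]
chain, loads = load_chain(vector)
assert list(shift_out(chain)) == vector
print(loads, "low-speed loads, then", CHAIN_BITS, "high-speed clocks")
```

Loading takes 66 low-speed cycles in this model, after which the only activity during the test itself is the single high-speed shift clock.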

Tester implementation 

The tester is constructed around a VME 
system card cage and backplane. This
allows the use of off-the-shelf micro¬
processor components for controlling the
tester. The VME bus was chosen over 
other high-performance microprocessor 
buses because it has a large amount (64) of 
uncommitted backplane pins. These are 
used for distribution of power to the 
boards containing GaAs and ECL compo¬ 
nents, and as auxiliary grounds. The CPU 
board has a Motorola MC68000 
microprocessor running at 8 MHz and a 
built-in PROM monitor. The CPU board 
also has three RS-232 serial ports for exter¬ 
nal communication and control. The 
RAM board contains 512 Kbytes of mem¬
ory plus refresh circuitry. Future plans
include the addition of a hard disk so that 
the tester can maintain its own library of 
diagnostics, automatic test generation 
software, and a small specialized operat¬ 
ing system. 

The tester’s architecture follows all the 
guidelines that were set forth previously. 
The architecture is such that there are no 
high-speed buses, and thus very little high¬ 
speed interconnect exists between boards. 
The small amount of high-speed intercon¬ 
nect that does run between boards is done 
with coaxial cables and SMA connectors. 
The architecture is modular in design with
most of the GaAs logic being contained in
hybrid semiconductor modules of the type 
previously described. The modules are 
built on a 10-layer polyimide structure 
laminated to a copper substrate. Five of 
the polyimide layers are conducting layers 
and are used to create a 50-ohm stripline- 
type transmission line for use as a high¬ 
speed interconnect. Each module contains 
12 custom GaAs digital ICs, 36 microwave 
“chip”-type capacitors for power supply 
bypassing, and several microwave “chip”- 
type resistors for terminating a high-speed 
interconnect. To aid in the dissipation of 
heat, all the ICs are mounted on thermal 
conduction columns connected through 
the polyimide layers to the heavy copper 
substrate. There are 15 low-speed I/O pins 
on each module, and four SMA connec¬ 
tors for high-speed I/O. A block diagram 
of the hybrid modules is shown in Figure 
10. 

A few of the chips used in the tester will 
not be placed in hybrid modules. These 
chips are to be packaged in leadless 
ceramic chip carriers that have been spe¬ 
cially designed for packaging high-speed 
digital logic. 13 These are the same carriers 
used to package the chips to be tested by 
the tester. Because of certain restrictions, 
details on the carriers cannot be given in 
this article. Readers who are interested in
the leadless ceramic chip carriers should 
obtain a copy of reference 13. 

A block diagram of the high-speed inter¬ 
face boards is shown in Figure 11. There 
is one of these boards for every pin of the 
chip under test. Each board consists of one 
hybrid semiconductor module, and some 
associated logic for addressing, control, 
and logic level translation. Each board is 
configured as either an input or an output. 
If a board is configured as an output, then 
the serial output of the board’s hybrid 
module is connected to the test head. If the 
board is configured as an input, then the 
serial input of the board’s hybrid module 
is connected to the test head. Connections 
are made with coaxial cables and 12-GHz 
SMA connectors. 

A block diagram of the clock and con¬ 
trol board is shown in Figure 12. The main 
parts of this board are a pulse generator 
and some fan-out logic. The pulse gener¬ 
ator is capable of generating a shift regis¬ 
ter control pulse that is from 1 to 264 
clocks in length, and is synchronized with 
the high-speed clock. This is accomplished 
by loading a string of 1’s of appropriate
length into a high-speed shift register chain 
identical to those used for interfacing
to/from pins of the unit under test. As this
register chain is clocked at full test speed,
the result is the desired programmable
pulse. After it has been generated, the shift
control signal and the clock are distributed
to the 24 high-speed interface modules.
There is also some logic on the clock and
control board for bus interface, address
selection, and control.

Figure 11. High-speed interface boards.

Figure 12. Clock and control board.
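
The pulse generator's principle can be shown with a short behavioral model: a run of 1's of the desired length is placed in a shift register chain, and the bit leaving the chain on each clock drives the shift control line. The position of the 1's within the chain and the assumption that 0's are shifted in behind them are illustrative choices, not details given in the article.

```python
def control_pulse(length, chain_bits=264):
    """Return the shift-control level seen on each clock for a pulse of `length` clocks."""
    if not 1 <= length <= chain_bits:
        raise ValueError(f"pulse length must be between 1 and {chain_bits} clocks")
    # The 1's sit nearest the output; the serial input is assumed to be tied to 0.
    chain = [1] * length + [0] * (chain_bits - length)
    levels = []
    for _ in range(chain_bits):
        levels.append(chain.pop(0))  # bit leaving the chain drives the control line
        chain.append(0)
    return levels


pulse = control_pulse(100)
print("clocks high:", sum(pulse), "of", len(pulse))
```

Because the same clock both shifts the chain and times the test, the pulse is synchronized with the high-speed clock by construction.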

The hybrid semiconductor modules and 
other components are mounted on fairly 
standard but carefully designed PCBs. The 
PCBs are 0.060-inch-thick multilayer 
glass-epoxy boards, as shown in Figure 13. 
There are seven conducting layers, five of 
them buried. The signal layers are sand¬ 
wiched between two ac ground planes. 
This creates a 50-ohm stripline transmis¬ 
sion line, which is used for all on-board 
high-speed interconnects. The high-speed 
interconnect is terminated with 50-ohm 
chip-type resistors, and all power planes 
are bypassed at all chip locations with 
chip-type capacitors. Off-board high¬ 
speed interconnect is done with coaxial 
cables and SMA connectors. 

The purpose of the test head was men¬ 
tioned in the previous section. Figure 14 
shows the test head, along with two lead¬ 
less ceramic chip carriers that fit into the 
test head. The test head fixture is con¬ 
structed of a polyimide structure that is 
laminated to a copper substrate. The poly¬ 
imide structure provides 50-ohm 
impedance-matched connections between 
the socket for the chip and the SMA con¬ 
nectors on the back of the substrate. 


Tester operation 






Figure 14. (a) Test head, top view; (b) 


The tester is not a stand-alone unit, but 
operates as a slave to a personal computer, 
workstation, or mainframe computer. In 
Figure 6, a test is started by the host sys¬ 
tem loading input test vectors and other 
information into the tester’s main memory 
through the RS-232C port. The micro¬ 
processor then takes over control of the 
tester and transfers the input test vectors 
and clock phasing commands, via the 
microprocessor bus, to the appropriate 
high-speed interface modules. Next, the 
frequency generator and power supplies 
are programmed to the values specified by 
the test. The clock and control board is 
then programmed with the length of the 
test vectors; this is done via the 
microprocessor bus. 

At this point, everything is ready and the 
clock and control board is given a signal to
start the test. The clock and control board
sends out a shift command to all the high¬ 
speed interface boards. The high-speed 
interface boards connected to input pins of 
the chip under test shift their data out to 
the test head. The high-speed interface 
boards connected to output pins of the 
chip under test shift in data from the test 
head. After the appropriate number of 
clocks (as specified by the length of the test 
vectors) the clock and control board stops
the shifting, and control is returned to the 
microprocessor. 

When a high-speed test has thus been 
completed, the result vectors are unloaded 
from the high-speed interface boards at 
low speed and stored in the main mem¬
ory. The result vectors can either be sent
back to the host or can be analyzed by the 
microprocessor. The tester can also be put 
into a loop mode, where the same test or 
a series of tests can be run over and over 
again. This mode of operation is useful for 
oscilloscope viewing and for testing certain 
state-intensive chips, for example, 
counters. 

The tester also has the ability to perform 
functional tests on itself. This is useful for 
the following three reasons: 

(1) It provides a means for initial 
debugging of the tester after construction. 

(2) It provides a means for debugging 
the tester should the system fail. 

(3) It serves as a confidence test to 
ensure that the tester is functioning 
properly. 

The tester’s self-test operates under the 
assumption that the power supplies are 
working correctly, that the RS-232C inter¬ 
face to the host system is working cor¬ 
rectly, and that there is no failure in the 
system jamming the microprocessor bus. 
The test has three parts: a CPU board con¬ 
fidence test, a main memory confidence 
test, and a high-speed interface module 
functional test. 

The CPU board confidence test is stored 
in EPROM on the CPU board and is run 
whenever the system is powered up, or 
whenever the reset switch on the front 
panel is pushed. The test exercises the 
microprocessor, the EPROM, and the 
memory mapping and protection circuitry 
to determine if they are functioning 
properly. If errors are detected, then 
appropriate messages are sent to the host 
system. 

The main memory confidence test is also 
stored in EPROM on the CPU board. It is 
automatically run after the CPU board 
confidence test is run. It tests all locations
in memory for stuck bits, shorted bits, and
address independence. Errors are reported 
to the host system. 

The high-speed interface module func¬ 
tional test is performed under the control 
of the host system, and can be run when¬ 
ever the user feels the need. The test 
requires that the user connect the input of 
every high-speed interface module to its 
own output. The host system then down¬ 
loads specific test vectors to the tester. The 
tester is programmed to load the test vec¬ 
tors into the high-speed interface modules, 
and to program the clock and control 
board with the appropriate test vector 
length. The test vectors are then shifted out 
of each high-speed interface module and 
back around to each module’s input. The 
vectors are then read out of the high-speed 
modules and checked for data corruption. 
Any errors are reported to the host system. 
This process is repeated several times with 
test vectors of various sizes, including the 
smallest and largest possible test vectors. 

Future plans for the tester include the 
addition of a hard disk and a specialized 
operating system. With these additions it 
will be possible for the tester to operate as 
a stand-alone system, not requiring a host 
computer or workstation. Future plans 
also include the porting to the tester of 
automatic test generation software that is 

One way to overcome
the joint effects of low 
chip density and long 
off-chip delays in 
GaAs systems is by 
migrating some tradi¬ 
tional hardware func¬ 
tions into software 
and by performing 
appropriate compile¬ 
time optimizations. 


Gallium arsenide (GaAs) has
begun to attract serious attention 
from computer system designers, 
in addition to the traditional interest of 
device physicists. This is due not only to 
the much heralded switching speed advan¬ 
tages of GaAs, but also to the rapid pro¬ 
gress seen in GaAs digital chip densities 
over the past few years. Lured by poten¬ 
tial applications in supercomputer, mili¬ 
tary, aerospace, telecommunications, and 
high-speed testing areas, several compa¬ 
nies are currently pursuing GaAs tech¬ 
nology. 

Companies such as Gigabit, Harris,
Vitesse, and TriQuint Semiconductor are
already marketing GaAs chips for digital
applications. 1 In addition, several
laboratory designs exhibiting
higher performance have been presented; 
these include a 2000-gate gate array 2 and 
a 16K-bit static RAM, 3 with transistor 
counts of 8.2K bits and 102.3K bits, 
respectively. 

These new GaAs designs are expected to 
eventually replace silicon emitter-coupled 
logic (ECL) circuits in current SSI/MSI- 
based vector supercomputer implementa¬ 
tions. However, GaAs technology also 
offers the potential for dramatic perform¬ 
ance improvements for general-purpose 
applications not exhibiting the highly
regular data required for efficient vector
supercomputer execution.

GaAs computer system design has 
already generated considerable interest. 
GaAs computer design considerations 
have been presented earlier by Gilbert of 
the Mayo Clinic. 4 According to Karp of 
DARPA, the US government sponsored 
three teams in relation to a 32-bit GaAs 
microprocessor effort. 5 The participating 
companies were McDonnell Douglas, 
Texas Instruments with Control Data, and 
RCA. DARPA’s creative leadership has 
advanced the state of GaAs technology 
and GaAs computing in the United States 
tremendously. As a result, an 8-bit proces¬ 
sor design based on GaAs technology has 
already been presented by RCA. 6 

In this article we describe an approach 
to computer system design that we feel is 
very attractive for GaAs technology. Our 
strategy involves the use of a single-chip 
GaAs processor, an increased role for the 
compiler, and an aggressive migration of 
functions from hardware to the compiler. 
In fact, we believe that the advantages of 
GaAs technology cannot be fully exploited 
without further developments in compiler 
technology. This article follows previous 
papers on GaAs processor design 7 and 
GaAs system design, 8 and completes our
overview of GaAs technology-based
computer system design.

Explanation of our 
methodology 

In this section we describe the strategy 
that we believe is desirable for GaAs 
computer system implementations. As will 
be seen, in addition to drastic improve¬ 
ments in packaging technology, compiler 
improvements will be necessary to allow 
GaAs to approach its potential. We believe 
that packaging and compilation are the
two major bottlenecks in the area of GaAs
computing. 

We begin by describing the characteris¬ 
tics of GaAs that affect our system design 
strategy. We then describe how sophisti¬ 
cated compiler capabilities have already 
been introduced by the designers of 
reduced instruction set computer (RISC) 
processors. We conclude by demonstrat¬ 
ing that the increased reliance on compilers 
typical for silicon RISCs is very advanta¬ 
geous for GaAs processors, and, in fact, 
the characteristics of GaAs indicate that an 
even stronger compiler capability is 
desirable. 

GaAs characteristics relevant for computer 
system design. Several different device and 
logic circuit designs have been used in 
GaAs implementations. However, because 
of the superior maturity of published 
GaAs enhancement-mode driver/de¬ 
pletion-mode load, metal-semiconductor 
field effect transistor (E/D MESFET) 
designs, we will limit our discussions to the 
GaAs MESFET family. It is typically used 
in direct-coupled FET logic (DCFL) circuit 
configurations. 9 

There are three principal characteristics 
of E/D-MESFETs that differentiate GaAs 
computer design techniques from those 
used for silicon designs. 

One, GaAs has an inherent advantage in 
gate switching speed. This is due to the 
higher mobility of GaAs electrons—six to 
eight times that of silicon electrons. 9 

Two, the on-chip speed advantage that 
GaAs enjoys is not matched by an equal 
off-chip speedup. Interchip signal propa¬ 
gation delays are determined primarily by 
packaging considerations rather than inte¬ 
grated circuit technology 8 ; therefore, more 
time is lost in off-chip communication, in 
terms of instruction cycles, for GaAs 
processors than for silicon processors. This
is an area that should benefit greatly from
improved compiler technology. 

Three, the lower maximum transistor 
count of GaAs chips further limits the 
capability of GaAs designs. Current prob¬ 
lems with yield and the higher power 
requirements of GaAs limit transistor 
counts of GaAs chips to roughly one-tenth 
the corresponding value for silicon MOS 
chips. 7 Therefore, it is mandatory that 
some traditional hardware functions be 
migrated into the compiler to overcome the 
negative effects of having to build the sys¬ 
tem with a smaller number of devices. 


Compiler advances associated with silicon 
RISCs. The RISC philosophy has caused 
a reevaluation of computer architecture 
design methodology. RISCs have (suppos¬ 
edly) shown better performance than the 
customary complex instruction set com¬ 
puter (CISC) style by utilizing streamlined 
instruction sets (see, however, Fleming and 
Wallace 10 ). Example RISCs include the 
University of California-Berkeley RISC I 
and II (UCB-RISC), 11 the Stanford Uni¬ 
versity MIPS (SU-MIPS), 12 and the IBM 
801. 13 Much of the credit for the RISC 
performance advantage is given to the fact 
that RISCs execute frequently occurring 
instructions very fast. What is frequently 
overlooked are the requirements that RISC 
architectures place on compiler technology 
in order to achieve these results. 

First, in order to help minimize the
instruction cycle time, a conscious effort is
made to transfer hardware functions into
software. Examples of this are the
elimination of hardware timing hazard 
interlocks 14 in the SU-MIPS and the 
elimination of hardware sequencing haz¬ 
ard interlocks 14 in the UCB-RISC and SU- 
MIPS. The IBM 801 has hardware inter¬ 
locks, but the optimizing compiler tries not 
to use them. 

Also, in order to reduce decoding com¬ 
plexity and achieve a shorter cycle time, the 
instruction set is reduced to a small set of 
primitives comparable to vertical microin¬ 
structions on a microprogrammed ma¬ 
chine. This presents new opportunities in 
that many compiler optimizations not 
available on CISC macroinstruction 
sequences are possible for RISC instruc¬ 
tion sequences because the compiler has 
access to the fine-grained RISC primitives. 
One example of this is the data load fill-in 
of the IBM 801 compiler. 13 As described 
later, a RISC compiler is able to reduce the 
performance degradation due to long- 
latency data loads by scheduling other useful
instructions during the latency period.
In a memory-to-memory instruction of a 
CISC, however, the full data fetch latency 
must be absorbed because the compiler is 
not able to (and does not) “get inside” this 
type of coarse-grained atomic instruction.

A third reason for increasing compiler 
capabilities is the fast rate of RISC instruc¬ 
tion execution itself. An extremely fast 
processor such as a RISC will not achieve 
its potential unless its surrounding environ¬ 
ment is able to support this speed. Since 
off-chip memories have traditionally not 
been able to do so, except for relatively 
small memories, compiler algorithms to 
reduce the penalty induced by the off-chip 
environment have been developed. The 
compiler for the IBM 801 uses a very 
sophisticated register coloring algorithm in 
order to maximize the register lifetime and 
minimize the need to fetch off-chip data. 15 
The IBM 801 also incorporated special 
instructions that allow the compiler to 
override the runtime caching mechanism 
whenever the compiler determines a better 
placement policy. 13 The SU-MIPS 
designers developed a packing scheme to 
allow their processor to fetch two instruc¬ 
tions concurrently, 12 a strategy that is 
especially effective in environments with 
slow off-chip memory access and that 
required the SU-MIPS compiler to per¬ 
form the instruction packing. 

GaAs computer system design. We are 
interested in computer system designs that 
utilize VLSI (>10,000 transistors) GaAs 
chips for both the processor and the 
highest level(s) of the memory hierarchy 
(cache). VLSI-based designs are desirable 
because they require less board area and 
power, and exhibit greater performance 
and reliability than systems built exclu¬ 
sively from SSI and MSI parts. In a GaAs 
processor implementation, VLSI designs 
are especially advantageous because they 
minimize the need for interchip communi¬ 
cation. 

Single-chip VLSI GaAs processors must 
necessarily inherit the general characteris¬ 
tics of RISCs that make the processors so 
compiler-dependent for optimal perform¬ 
ance. In fact, the successful exploitation of 
GaAs technology in single-chip processor 
designs requires even stronger compiler 
capabilities. Obviously, the limited transis¬ 
tor count of the processors dictates that the 
transfer of functionality from hardware to 
the compiler is highly desirable for GaAs 
designs. Also, the extremely fast switching 
speeds of GaAs increase the need for the 
compiler to reduce the negative effects of
a slow off-chip environment. 

As already indicated, the successful 
exploitation of GaAs in single-chip proces¬ 
sor configurations depends considerably 
on our ability to utilize compiler capabili¬ 
ties to their fullest. In the silicon domain, 
very high levels of integration are available 
to better enable the full use of increasingly 
fast data path transistors in a relatively 
slow off-chip environment. In the GaAs 
environment fast gate switching speeds are 
accompanied by a lower level of integra¬ 
tion. Without the assistance from an 
increasingly sophisticated compiler tech¬ 
nology, GaAs processors will struggle to 
attain a speed advantage over silicon 
processors near the level marked by their 
gate speed advantage. We believe this com¬ 
piler technology-GaAs technology rela¬ 
tionship is analogous to the CAD-VLSI 
relationship of several years ago, in that the
capabilities of VLSI could not be fully
utilized without appropriate advances in 
CAD technology. 


Compiler enhancement 
and compiler migration 
candidates 

Several functions that may be wanted in 
a GaAs system compiler fall into one of the 
two following general groups of compiler 
functions: 

(1) functions traditionally implemented 
in hardware, which could be migrated into 
the compiler (for better exploitation of the 
scarce on-chip transistor count), or
migration candidates; and

(2) functions to add to the compiler (to 
enhance its capabilities in fighting the 
increased off-chip communication delays), 
or enhancement candidates. 

We briefly describe each compiler func¬ 
tion and discuss the rationale behind its 
selection. Then we present algorithms 
which may be used to implement each 
function, and discuss the areas where algo¬ 
rithm advances are necessary. Several of 
these compiler functions are borrowed 
from silicon designs, but the characteristics 
of GaAs technology make them differ con¬ 
siderably in their implementation. For 
example, the number of branch delay slots 
to fill in is now about one-half order of 
magnitude larger; also the number of 
potential pipeline hazards has increased. 

Our presentation is divided along 
architectural boundaries. We first discuss
two control functions that are typically
implemented in hardware. Next, we present 
several memory system functions for which 
compiler support may be beneficial. We 
continue with two compiler functions 
associated with arithmetic, and two com¬ 
piler functions associated with various 
instruction formats. We conclude with 
external communications. 

Control. There are two migration candi¬ 
dates for compiler implementation that 
have traditionally been performed by hard¬ 
ware in silicon pre-RISC designs. These are 
examples of a direct migration of hardware 
responsibilities into the compiler. 

Pipeline interlock (timing hazards): compi¬ 
lation candidate. The pipelining of proces¬ 
sors introduces the potential for conflicting 
claims for resources among multiple pipe¬ 
line stages and/or multiple instructions. 
These conflicts may be referred to as tim¬ 
ing hazards, 14 three classes of which are 
destination-source conflicts, source- 
destination conflicts, and destination- 
destination conflicts. A destination-source 
conflict occurs when the result of a pipe¬ 
line stage has not been stored before a suc¬ 
ceeding pipeline stage requires it. A 
source-destination conflict occurs when a 
pipeline stage of instruction i requires a 
result produced from an earlier pipeline 
stage in instruction i + k. A destination- 
destination conflict occurs when a resource 
is written concurrently by pipeline stages 
of two different instructions. If not 
avoided, all three of these conflicts may 
result in incorrect execution of an other¬ 
wise correct program. 
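
As a rough illustration of the three conflict classes, the following Python sketch compares the register sets of two instructions in program order. It reads the classes as what are now commonly called read-after-write, write-after-read, and write-after-write hazards; that mapping, the `Instr` record, and the `conflicts` helper are assumptions made for illustration, not part of any processor or compiler described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record: the registers it reads and writes."""
    name: str
    reads: frozenset
    writes: frozenset


def conflicts(earlier, later):
    """List the hazard classes possible between two instructions in program order."""
    found = []
    if earlier.writes & later.reads:
        # The later instruction needs a result the earlier one has yet to store.
        found.append("destination-source")
    if earlier.reads & later.writes:
        # The later instruction could overwrite a value the earlier one still reads.
        found.append("source-destination")
    if earlier.writes & later.writes:
        # Both instructions write the same resource.
        found.append("destination-destination")
    return found


i1 = Instr("I1", frozenset({"r1"}), frozenset({"r2"}))
i2 = Instr("I2", frozenset({"r2"}), frozenset({"r1"}))
print(conflicts(i1, i2))
```

Whether a potential conflict becomes an actual hazard depends on the pipeline timing, which this check deliberately ignores.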

The pre-RISC silicon CISC approach to 
these conflicts has traditionally been a 
hardware one. For example, the CDC 6600 
utilized a “scoreboard” to keep track of its
hardware resources at runtime. 16 In addi¬ 
tion to approaches such as this, some sili¬ 
con RISCs also employ hardware 
solutions. The UCB-RISC utilized an 
internal forwarding bus in order to elimi¬ 
nate one possible cause of destination- 
source conflict. This was used for situa¬ 
tions when the result of an ALU operation 
was needed as a source operand for the suc¬ 
ceeding operation. 

Some silicon RISC designers have gone 
to extremes in avoiding the use of hardware 
resources for resolution of these conflicts. 
The SU-MIPS designers required the com¬ 
piler to avoid generating instruction 
sequences with any such conflict. They 
even declined to provide an internal forwarding
bus as was used in the UCB-RISC.
An SU-MIPS designer claimed that, in
addition to a reduction in hardware com¬ 
plexity, software pipeline interlocks 
reduced the instruction cycle time by 10 
percent. 17 

The software solution to timing hazard 
detection and avoidance is advantageous 
for GaAs processors primarily because of 
its reduction in hardware resource require¬ 
ments, and also because of the possible 
reduction in instruction cycle time it allows. 
Because GaAs instruction pipelines can be 
expected to be longer than silicon pipe¬ 
lines, 7 the pipeline interlock hardware for 
a GaAs processor would probably be even 
more complex. The hardware resources not 
required for timing hazard prevention can 
have a great positive impact on perform¬ 
ance in a resource-scarce GaAs processor, 
if they can be utilized elsewhere within the 
processor. 

The longer instruction pipelines 
expected for GaAs processors not only 
increase the benefits of a software 
approach, but also change the require¬ 
ments placed on the compiler algorithms 
that implement the pipeline interlocking 
function. Compiler algorithms used for 
relatively shallow silicon pipelines are 
generally not efficient in the case of rela¬ 
tively deep GaAs pipelines; therefore, as 
discussed next, advances in compiler tech¬ 
nology (with respect to this issue) will 
greatly increase the performance of GaAs 
processors. 

Pipeline interlock (timing hazards): compiler
algorithms. Many of the compiler techniques
for migrating functions from
hardware to software are based on the 
method of “code reorganization,” or 
“horizontal code compaction,” assuming 
that more than a single machine primitive 
is executed from a fetched instruction 
memory unit. The technique is primarily 
useful in the removal of timing hazards, 
but it can be used in a number of different 
ways, as will be discussed later. 

The task of reorganization, then, is to 
accept an input program and to rearrange 
the instructions so that no timing hazards 
are present in the output program. Specif¬ 
ically, the instructions are to be arranged 
taking cognizance of pipeline delays so that 
data is passed between these instructions 
correctly. For instance, if there is a two- 
cycle delay between the initiation of an 
instruction and the writeback of its calcu¬ 
lated result, then the reorganizer must 
ensure that at least two instructions are 
placed between the instruction and any 
other instruction that utilizes the result. 
In addition, the reorganizer must also
ensure that no machine utilization con¬ 
straints are violated. Such constraints may 
be formulated as statements like “loads 
may occur no more frequently than every 
six cycles” (typical for some GaAs sys¬ 
tems), or “the bits of these instructions 
must not cross over a quadword bound¬ 
ary.” In the following discussion, these 
constraints are referred to as “resource 
requirements.” For a particular architec¬ 
ture/implementation, the specification of 
resource requirements may range from the 
trivial (none is needed) to the simple (each 
cycle has a small set of “resources” avail¬ 
able and a subset of these is “consumed” 
on each cycle) to the very complex (as for 
highly horizontal architectures with several 
levels of decoding). Fortunately, the last 
case is not of particular interest for GaAs 
implementations because of the limited 
number of gates and pins available. 
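
A resource requirement of the first kind quoted above can be checked mechanically. The Python sketch below tests a finished schedule against the "loads no more frequently than every six cycles" constraint; the function name `load_spacing_violations`, the one-opcode-per-cycle list representation, and the sample schedule are all hypothetical.

```python
def load_spacing_violations(schedule, min_gap=6):
    """Return (earlier_cycle, later_cycle) pairs of loads issued fewer than
    min_gap cycles apart. `schedule` holds one opcode string per cycle."""
    violations = []
    last_load = None
    for cycle, op in enumerate(schedule):
        if op == "load":
            if last_load is not None and cycle - last_load < min_gap:
                violations.append((last_load, cycle))
            last_load = cycle
    return violations


# Illustrative schedule only: loads at cycles 0, 3, and 9.
sample = ["load", "add", "nop", "load", "nop", "nop", "nop", "nop", "nop", "load"]
print(load_spacing_violations(sample))
```

A reorganizer would apply the same test incrementally, before committing an instruction to a cycle, rather than after the fact.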

Thus, reorganization may be viewed as 
a two-pronged problem, that is, a sequenc¬ 
ing problem dealing with the input and 
output registers of the instructions, and a 
resource utilization problem with the 
machine resource requirements of the 
instructions. The overall procedure of reor¬ 
ganization can be specified as follows: 
first, convert the input program into a 
directed acyclic graph, called the data 
precedence graph or DPG, that captures all 
of the register sequencing constraints, and, 
second, schedule each instruction for exe¬ 
cution according to some topological 
ordering on the DPG. Of course, the 
scheduler must keep track of what 
resources are available on each cycle so that 
it can determine the feasibility of issuing 
a particular instruction on a particular 
cycle. Moreover, depending on the exact 
form of the initial DPG and on the 
scheduling algorithm, the DPG may 
require updating as the scheduling ensues. 

In the DPG, there are three types of rela¬ 
tionships that must be represented: (1) data 
dependency, (2) data antidependency, and 
(3) output dependency. These relationships 
are straightforward and easy to understand.
An instruction I2 is data dependent
on instruction I1 by register r if I1 writes a
value into register r that is subsequently
read by I2. In the DPG, we would have a
data dependency arc from I1 to I2 labeled
with register r. Second, I2 is data
antidependent on I1 by register r if I1 must
read a value from register r before I2 overwrites
that value. This situation is represented
by a data antidependency arc from
I1 to I2 labeled with r. Last, I2 is output
dependent on I1 by register r if I1 must
write register r before I2 writes register r;
here, an output dependency arc is needed.
The first two relationships are clearly 
required to ensure each instruction reads 
the correct values; the most intuitive use of 
output dependencies is to ensure the pres¬ 
ervation of live registers. As it happens, 
there are less obvious ones. A thorough 
treatment of building and maintaining the 
DPG is found in Linn. 18 
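
The three arc types can be derived from the read and write sets of each instruction pair. The Python sketch below is a deliberately conservative, pairwise construction: it does not prune arcs made redundant by an intervening write, and it is not the procedure given in the cited reference. The name `build_dpg` and the tuple layout are assumptions.

```python
def build_dpg(program):
    """Build DPG arcs for a straight-line program.

    program: list of (name, reads, writes) in program order, where reads
    and writes are sets of register names.
    Returns a list of (source, destination, kind, register) arcs.
    """
    arcs = []
    for j, (name_j, reads_j, writes_j) in enumerate(program):
        for i in range(j):
            name_i, reads_i, writes_i = program[i]
            for r in sorted(writes_i & reads_j):
                arcs.append((name_i, name_j, "data", r))      # write, then read
            for r in sorted(reads_i & writes_j):
                arcs.append((name_i, name_j, "anti", r))      # read, then overwrite
            for r in sorted(writes_i & writes_j):
                arcs.append((name_i, name_j, "output", r))    # write, then write
    return arcs


# Hypothetical four-instruction fragment.
fragment = [
    ("I1", {"r1"}, {"r2"}),
    ("I2", {"r2"}, {"r3"}),
    ("I3", {"r4"}, {"r1"}),
    ("I4", {"r5"}, {"r3"}),
]
for arc in build_dpg(fragment):
    print(arc)
```

Because every arc runs from an earlier instruction to a later one, the result is acyclic by construction.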

The arcs of the DPG represent not only 
data sequencing constraints but timing 
constraints as well. For example, suppose 
that we have a data dependency arc labeled 
with register r3 from I1 to I2, that I1 writes
register r3 in the sixth cycle after issue, and
that I2 reads r3 in the third cycle after issue.
Then, clearly, we may deduce that I2 may
not be issued fewer than three cycles after
I1. The best situation is obtained if there
are other instructions from the program 
that are available for execution in these 
three cycles. Otherwise, “No-Operation” 
instructions (No-Ops) must be used. Obvi¬ 
ously, the true instruction execution rate is 
reduced whenever these extra No-Ops are 
required. 
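
The arithmetic behind such a timing constraint is small enough to state as code. The Python sketch below assumes cycles are counted from issue and that a single flag decides whether a value may be read in the same cycle in which it is written; the function name and the flag are illustrative assumptions.

```python
def min_issue_distance(write_cycle, read_cycle, same_cycle_read_ok=True):
    """Minimum number of cycles between issuing a producer and its consumer.

    write_cycle: cycle after issue in which the producer writes the register.
    read_cycle:  cycle after issue in which the consumer reads it.
    """
    distance = write_cycle - read_cycle
    if not same_cycle_read_ok:
        distance += 1  # the read must fall strictly after the write
    return max(distance, 1)  # a later instruction issues at least one cycle later


# The example above: written in the sixth cycle, read in the third.
print(min_issue_distance(6, 3))
```

With the flag cleared, a machine whose reads precede writes within a cycle needs one extra cycle of separation for the same write and read positions.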

The scheduling algorithm operates by 
considering at each time t all of the instructions
that are “available” at time t, that is,
each instruction with both the property of 
having all of its ancestors in the DPG hav¬ 
ing already been scheduled sufficiently far 
in advance to satisfy write/read delays and 
the property that all of the resources 
required for execution are available. Of 
these, the “best” one (according to some 
heuristic) is chosen to be issued at time t. 
If more than one instruction may be issued 
from the same instruction register load, the 
process is repeated until no more instruc¬ 
tions can be issued in the current cycle. 
Then the algorithm applies the consider¬ 
ations at time t + 1; this is continued until 
no more instructions remain to be 
scheduled. 
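
A minimal Python sketch of this cycle-by-cycle loop follows. It assumes a single-issue machine, uses "most direct successors in the DPG" as the heuristic, and models one resource rule (a load may not issue in the cycle right after another load). The function `list_schedule`, its input layout, and the sample program are hypothetical, and the arcs are assumed to form an acyclic graph.

```python
def list_schedule(instrs, arcs):
    """Schedule one instruction per cycle, inserting "nop" when none can issue.

    instrs: dict mapping instruction name -> True if it is a load,
            listed in program order.
    arcs:   list of (source, destination, min_distance) DPG arcs.
    """
    preds = {name: [] for name in instrs}
    successors = {name: 0 for name in instrs}
    for src, dst, dist in arcs:
        preds[dst].append((src, dist))
        successors[src] += 1

    issued = {}    # name -> cycle in which it was issued
    schedule = []  # one entry per cycle
    t = 0
    while len(issued) < len(instrs):
        prev_was_load = bool(schedule) and instrs.get(schedule[-1], False)
        ready = []
        for name, is_load in instrs.items():
            if name in issued:
                continue
            data_ok = all(p in issued and issued[p] + d <= t
                          for p, d in preds[name])
            resource_ok = not (is_load and prev_was_load)
            if data_ok and resource_ok:
                ready.append(name)
        if ready:
            # Heuristic: most direct successors; ties go to program order.
            best = max(ready, key=lambda n: successors[n])
            issued[best] = t
            schedule.append(best)
        else:
            schedule.append("nop")
        t += 1
    return schedule


# Hypothetical program: two loads feed C, C feeds D, E is independent.
program = {"A": True, "B": True, "C": False, "D": False, "E": False}
dpg = [("A", "C", 4), ("B", "C", 4), ("C", "D", 3)]
print(list_schedule(program, dpg))
```

In this run the independent instruction E fills the slot that the load rule would otherwise turn into a No-Op; the remaining No-Ops come from data delays that no available instruction can cover.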

Let us consider a simple example to get 
a flavor of this process. In Figure 1 (a-b), 
a fragment of an HLL program and its 
realization in a MIPS-style assembly lan¬ 
guage are depicted. Here, we assume the 
rK, rL, and rM are registers that contain 
the values of K, L, and M from the pro¬ 
gram fragment, respectively; this conven¬ 
tion continues throughout our discussion. 
Figure 1c shows the DPG. Let us further
assume that the architecture has the following
properties:

• A three-address instruction, op
r1,r2,r3, executes in a three-stage pipeline,
where r1 is read in the zeroth cycle
after issue, r2 in the first cycle after
issue, and r3 is written in the second
cycle after issue. The reads on a cycle
occur before the writes; the result of
this is that if r3 is utilized by a subsequent
instruction, then two cycles
must intervene if r3 is the first read
register, but only one if r3 is the
second.

• For a load A[r1],r2, r1 is read on the
first (origin zero) cycle, and r2 is written
in the fourth cycle. Again, an
instruction may not read a value from
a register in the same cycle that it is
written into the register.

• Because of a lack of external memory
address buffers, a load instruction
may not be issued in the cycle immediately
following another load
instruction.

• Only one instruction may be issued
from a single instruction register load.

Figure 1. (a) Fragment of program in
HLL. (Note the ~ is the logical “and”
operator.) (b) Fragment of program in the
MIPS-style assembly language. (c) DPG
of the program fragment.

Now, let us trace the execution of the reor¬ 
ganizer in Figure 2. 

Figure 2a illustrates the initial setup.
Note that instructions I1 and I4 each has
all of its ancestors already scheduled. Thus,
both are available at time = 0. For this
architecture, we define a pseudoresource m
to keep track of whether a load instruction
may be issued in any particular cycle. Initially,
load instructions are enabled for all
cycles. On the basis of the fact that I1 has
one more descendant than I4, we choose to
schedule I1. The updated situation is
shown in Figure 2b. Note that I1 has been
marked as scheduled, the available time for
I4 has been updated, and I2 has become
available, though not until time = 2. Also,
note that if rK were read on the zeroth cycle
in I2, instead of the first, then I2 would not
have become available until time = 3. At
time = 1, I4 is the only instruction available.
Figure 2c shows the updated state at
time = 1.

At time = 2, I2 is the only available
instruction. After verifying that its
required resources are available at time =
2, I2 is scheduled there (as depicted in Figure
2d). Note that the scheduling of I2 also
causes the removal of m at time = 3. This
prevents the “two-reads-in-a-row” problem.
At time = 3, I5 is data available, but
cannot be scheduled because m is not
available. Since no instruction is both data
and resource available, a No-Op is generated,
and I5 is then scheduled at time = 4.
Since I3 is not available until time = 6,
another No-Op is generated at time = 5.
The state after time = 5 is shown in Figure
2e.

Figure 2f shows the situation after I3 is
scheduled at time = 6. I6 becomes available
at time = 9 since it may not read rK
until three cycles after I3 is issued. Since
“and” is commutative, it might sometimes
make sense to have the reorganizer automatically
recode I6 as “and rL,rK,rL”; not
in this case, however, since I6 must wait five
cycles after the issuing of I5 to read rL. Figure
2g shows the situation after I6 is scheduled
at time = 9 and Figure 2h shows the
final code.

It is important to keep in mind that the 
code generated by this scheduling process 
does not generally execute “stand alone”; 
rather, the code is probably a fragment of 
a larger program. Thus, measures must be 
introduced to ensure that hazards are not 
produced as program fragments are pieced 
together. Figure 3 (a-b) shows a program 
fragment and flow graph, respectively, to 
illustrate the problem. A common 
approach to reorganization (called local
reorganization) is to reorganize each of the
basic blocks of a program (that is, maximal
single-entry, single-exit fragments) and to 
glue the resulting pieces together. Figure 3c 
shows possible reorganizations of each of 
the basic blocks of the fragment considered 
independently. Although each reorganized 
fragment executes correctly if independent, 
the sequential execution of the code will 
not produce the desired result. 

To see this, consider the code sequence
executed when Block2 and Block3 are concatenated,
as shown in Figure 3d. Note that
I7 (at the top of Block3) will read the
wrong value, that is, not the one produced
by I4. A few moments of reflection will
reveal that this problem may be solved by
repacking the blocks, as shown in Figure
3e. (I8' is “add r10,r8,r10.”) Again, incorrect
execution results since now we have a
load (I6) immediately following another
load (I4). Figure 3f gives a packing that is
legal with respect to the Block2-Block3
boundary; whether it is completely legal
depends on how the block containing
instruction X1 is packed.

Figure 3. (a) A program fragment. (b) Flow graph of the program fragment. (c) Basic
blocks of the program fragment. (d) Code sequence when Block2 and Block3 are concatenated.
(e) Code sequence of Block3 after repacking with respect to I4. (f) Code
sequence of Block3 after repacking with respect to Block2-Block3 boundary.

Again, two problems arise when “piecing”
the blocks together: a data sequencing
problem and a resource problem. Only
trivial solutions have been obtained for the
resource problem. The most common solu¬ 
tion is not to allow resource utilization to 
propagate beyond block boundaries. In 
this case, such a strategy requires that a No- 
Op be appended to the code for Block2. 
This results in a wasted cycle if the instruc¬ 
tion at the top of the next block is not a 
load. This problem can be especially 
thorny if long resource utilizations are 
needed, such as for a nonpipelined 
floating-point coprocessor. This is dis¬ 
cussed in greater detail in a later section. 
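
The "do not propagate resource utilization" policy can be sketched in a few lines of Python, assuming the only resource rule is that a load may not directly follow another load. The helper `seal_blocks` and the opcode strings are hypothetical.

```python
def seal_blocks(blocks, is_load):
    """Append a No-Op to every block that ends in a load, so that no load
    at the top of a successor block can directly follow it."""
    sealed = []
    for block in blocks:
        out = list(block)
        if out and is_load(out[-1]):
            out.append("nop")
        sealed.append(out)
    return sealed


# Two illustrative blocks; only the first ends with a load.
blocks = [["add", "load"], ["sub", "store"]]
print(seal_blocks(blocks, lambda op: op == "load"))
```

The appended No-Op is wasted whenever the successor block does not begin with a load, which is the cost the text describes.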

Two solutions are possible when considering
interblock data delays. The simple
approach is not to allow data delay propa¬ 
gation as above. However, the data delay 
problem is significantly simpler than the 
resource utilization problem. This occurs 
because in the data delay problem it is only 
necessary to determine the maximum delay 
that is carried into the block by any write; 
in the resource utilization problem we 
would have to find all possible resource 
“starting states” for the block. This could 
be very difficult if resources are consumed 
for long periods. However, it is straightfor¬ 
ward to determine the maximum amount 
of delay propagated into any block. But, 
applying this information naively can 
result in very poor code.

Figure 4a shows the flow graph for a pro¬ 
gram fragment in which the two upper 
blocks (Block1 and Block2) both propagate
delay into the lower block. As
depicted, Block1 brings four units of delay
into Block3, whereas Block2 brings in only 
one unit. Assume that Block2 is the textual 
predecessor of Block3 in the program; that 
is, Block3 is entered from Block2 if the 
branch in Block2 is not taken. A simple 
solution to this problem is shown in Figure 
4b. The problem here is that, if Block3 is 
entered mostly from Block2, then three 
wasted cycles have been introduced most of 
the time. The solution of Figure 4c would 
be a good one if Block1 were the textual
predecessor of Block3 instead of Block2. 
Figure 4d potentially introduces additional 
delay on the Block1-Block4 path. This
might be costly depending on execution 
frequencies. The solution in Figure 4e may 
not even be implementable for an architec¬ 
ture with a branch delay of more than 
three, which is typical in some GaAs sys¬ 
tems. The solution of Figure 4f, copying 
the destination block, is the most general. 
However, it increases the size of the pro¬ 
gram. Indeed, in order to save the most 
cycles, many of Block3’s descendants must 
be copied as well. If the fragment shown is 
part of a loop and if an instruction cache 
is employed, then the effectiveness of the 
cache may be reduced by the larger code 
size. Thus, no completely effective solution 
for this problem has been formulated. 
More work is needed in several areas. 
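
The cost of the naive approach of Figure 4b can be estimated with a short calculation. The following Python sketch pads the destination block for the worst-case predecessor and reports the expected number of wasted cycles per entry. The delays are those of Figure 4a; the entry probabilities are hypothetical values chosen only for illustration.

```python
def max_incoming_delay(delays):
    """Largest delay carried into a block by any predecessor."""
    return max(delays.values())


def expected_wasted_cycles(delays, entry_prob):
    """Expected No-Op cycles wasted per entry when the block is padded
    for the worst-case predecessor."""
    pad = max_incoming_delay(delays)
    return sum(entry_prob[p] * (pad - d) for p, d in delays.items())


# Delays from Figure 4a; the entry probabilities are assumed, not measured.
delays = {"Block1": 4, "Block2": 1}
entry_prob = {"Block1": 0.1, "Block2": 0.9}

print(max_incoming_delay(delays))
print(round(expected_wasted_cycles(delays, entry_prob), 2))
```

With these assumed probabilities, almost every entry into Block3 pays the three cycles that only the Block1 path needs, which is the drawback noted for Figure 4b.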

When write delays become several cycles
in length, as they do in GaAs
microprocessors, typical blocks frequently
do not contain enough unrelated
instructions to cover the timing delays.
Consequently, too many No-Ops must be
inserted, and efficiency decreases
dramatically. In such cases, the efficiency
of the reorganization process can frequently
be increased by finding instructions from
other blocks to fill in the “holes.”
Consider the fragment shown in Figure 5a. An
extraordinarily good code generator might 
provide the code shown in Figure 5b. Here, 
we have assumed that a marginal index
vector M is available and, further, that the
row stride of the array is available in
register rSTRIDE. The column stride size is
equal to one.

Focusing on the T-block and the X-block,
Figure 5c shows how the blocks might be
compacted individually; the code quality
is not very impressive. Figure 5d shows how
the code can be improved by pushing
instructions from the X- and Y-blocks into
the T-block. The code improves by 17 percent.
It is somewhat biased in favor of
executing along the T-X path instead of the
T-Y path. If we bias completely in favor of
the T-X path, that is, if we assume that its
probability of being executed is far greater
than for the T-Y path, then we obtain the
code of Figure 5e. This code is optimal with
respect to execution of the T-X path.
However, Figure 5f gives code that executes the
T-X path just as fast, and executes the T-Y
path two cycles faster. Interestingly, even
code so biased in favor of one execution
path is actually optimal, as long as the
probability of the T-X path is at least as
large as the probability of the T-Y path.

Figure 5. (a) A program fragment in HLL. (b) Assembly code sequence generated for the program fragment. (c) Code sequences of the T-block and X-block. (d) Improved code sequence of the program fragment, obtained by moving instructions from the X- and Y-blocks into the T-block. (e) Optimal code of the program fragment with respect to the T-X path. (f) Optimal code of the program fragment with respect to the T-Y path.

(a)
    if A[I,J] > 0 then
        B ← (A[I,J+1] - A[I,J-1])
    else
        B ← (A[I+1,J] - A[I-1,J]);

(b)
    T1: load M[rI],r1
    T2: add rI,r1,r2
    T3: load A[r2],r2
    T4: bge #0,r2,Y1

    X1: add #1,rJ,r3
    X2: add r1,r3,r3
    X3: load A[r3],r3
    X4: add #-1,rJ,r4
    X5: add r1,r4,r4
    X6: load A[r4],r4
    X7: sub r3,r4,rB
        goto JOIN

    Y1: add rSTRIDE,rJ,r5
    Y2: add r1,r5,r5
    Y3: load A[r5],r5
    Y4: sub rJ,rSTRIDE,r6
    Y5: add r1,r6,r6
    Y6: load A[r6],r6
    Y7: sub r5,r6,rB
        goto JOIN

(c)
    T1: load M[rI],r1
        No-Op
        No-Op
        No-Op
    T2: add rI,r1,r2
        No-Op
    T3: load A[r2],r2
        No-Op
        No-Op
        No-Op
    T4: bge #0,r2,Y1

    X1: add #1,rJ,r3
    X4: add #-1,rJ,r4
    X2: add r1,r3,r3
    X5: add r1,r4,r4
    X3: load A[r3],r3
        No-Op
    X6: load A[r4],r4
        No-Op
        No-Op
        No-Op
    X7: sub r3,r4,rB

(d)
    T1: load M[rI],r1
    X1: add #1,rJ,r3
    X4: add #-1,rJ,r4
    Y1: add rJ,rSTRIDE,r5
    T2: add rI,r1,r2
    Y4: sub rJ,rSTRIDE,r6
    T3: load A[r2],r2
    X2: add r1,r3,r3
    X5: add r1,r4,r4
    Y2: add r1,r5,r5
    T4: bge #0,r2,Y2 ; label
    X3: load A[r3],r3
        No-Op
    X6: load A[r4],r4
        No-Op
        No-Op
        No-Op
    X7: sub r3,r4,rB

(e)
    T1: load M[rI],r1
    X1: add #1,rJ,r3
    X4: add #-1,rJ,r4
    Y1: add rJ,rSTRIDE,r5
    T2: add rI,r1,r2
    X2: add r1,r3,r3
    T3: load A[r2],r2
    X5: add r1,r4,r4
    X3: load A[r3],r3
    Y4: sub rJ,rSTRIDE,r6
    X6: load A[r4],r4
    T4: bge #0,r2,...
        No-Op
        No-Op
    X7: sub r3,r4,rB

    not very meaningful

(f)
    T1: load M[rI],r1
    X1: add #1,rJ,r3
    X4: add #-1,rJ,r4
    Y1: add rJ,rSTRIDE,r5
    T2: add rI,r1,r2
    X2: add r1,r3,r3
    T3: load A[r2],r2
    X5: add r1,r4,r4
    X3: load A[r3],r3
    Y4: sub rJ,rSTRIDE,r6
    X6: load A[r4],r4
    Y2: add r1,r5,r5
    Y5: add r1,r6,r6
    T4: bge #0,r2,...
    X7: sub r3,r4,rB
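
The No-Op count of an individually compacted block, such as the T-block of Figure 5c, can be reproduced with a few lines of Python. This is only a sketch: it issues instructions in their given order, models data delays but not resource conflicts, and the issue-to-use distances in `WRITE_DELAY` are assumptions chosen to agree with the figure (a load result usable four cycles after issue, an ALU result two cycles after issue).

```python
# Assumed issue-to-use distances, chosen to match the No-Op counts of Figure 5c.
WRITE_DELAY = {"load": 4, "add": 2, "sub": 2}


def pad_with_noops(block):
    """Issue instructions in order, inserting No-Ops until operands are ready.

    Each instruction is (label, opcode, dest, sources); dest may be None.
    Returns the padded schedule as a list of labels.
    """
    ready = {}        # register -> first cycle at which it may be read
    schedule = []
    for label, op, dest, srcs in block:
        cycle = len(schedule)
        earliest = max([ready.get(r, 0) for r in srcs] + [cycle])
        schedule.extend(["No-Op"] * (earliest - cycle))
        schedule.append(label)
        if dest is not None:
            ready[dest] = earliest + WRITE_DELAY[op]
    return schedule


t_block = [
    ("T1", "load", "r1", ["rI"]),
    ("T2", "add", "r2", ["rI", "r1"]),
    ("T3", "load", "r2", ["r2"]),
    ("T4", "bge", None, ["r2"]),
]
schedule = pad_with_noops(t_block)
print(schedule)
print(schedule.count("No-Op"))
```

Under these assumptions the four useful instructions of the T-block are accompanied by seven No-Ops, which is why instructions from the X- and Y-blocks are worth importing.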

Although this example shows only
movement of instructions between adjacent
blocks, it is possible to look arbitrarily
far ahead in the program to find “free”
instructions. Details of this procedure may
be found in Linn. 19 It should be noted, 
however, that moving instructions around 
can lead to decreased execution times, and 
also to potentially large increases in pro¬ 
gram size. Figure 6a shows a program frag¬ 
ment where it is determined that 
instruction I in block B will be moved to 
block A. While faster execution may result 
along the A-C1-C2-C3-B path, the pro¬ 
gram must be modified as shown in Figure 
6b, in order that the semantics of the pro¬ 
gram be preserved. 

As a final note, the reorganization pro¬ 
cedure outlined here is frequently utilized 
as a post-pass activity that is initiated only 
after code generation and register alloca¬ 
tion are completed. In fact, we may expect 
that very good code may be obtained only 
when the register assignment phase has the 
knowledge of what shape the reorganized 
code is likely to have. The PL.8 compiler 
for the IBM 801 actually performs reor¬ 
ganization both before and after the reg¬ 
ister assignment phase. In this way, it tries 
to avail itself of final code shape informa¬ 
tion during register assignment, and to use 
reorganization to cover any remaining No- 
Ops as well. 


Branch delay fill-in (sequencing hazards): 
compilation candidate. The negative effect 
of branch instructions on the performance 
of pipelined processors has long been 
recognized. In a typical pipelined proces¬ 
sor, the condition evaluation portion of a 
conditional branch is not complete before 
the next sequential instruction is fetched. 
Therefore, if the condition evaluates to 
true, the branch is taken and the instruction 
that was just fetched must be discarded. 

Silicon CISC approaches to the solution 
of the branch delay problem generally rely 
on a hardware mechanism to inhibit the 
execution of the sequential instruction. 
More ambitious techniques involve 
predicting the outcome of the condition 
evaluation and fetching the appropriate 
instruction. A wrong choice here again 
requires the inhibition of the execution of 
an already fetched instruction. 

The silicon RISC design approach relies 
on the concept of delayed branching, 17 a 
technique used widely in microprogram¬ 
ming. In this approach, a fixed number of 
instructions after the branch instruction 
are always executed. In order to preserve 
the correct program execution, only a sub¬ 
set of instructions are legal candidates for 
insertion after branch instructions. For 
example, instructions influencing the out¬ 
come of the condition evaluation cannot 
be moved to a location following the 
branch instruction. Silicon RISC architec¬ 
tures rely on the compiler to move as many 
instructions as possible to positions follow¬ 
ing branches in order to increase perform¬ 
ance, while also preserving the correctness 
of program execution. 


The delayed branching approach is 
appealing for GaAs processor implemen¬ 
tations. First, it reduces hardware complex¬ 
ity because there is no need to inhibit 
instruction execution, undo partial execu¬ 
tions, etc. The advantages of this approach 
for a resource-scarce GaAs processor are 
clear. In addition, the delayed branching 
technique leads to faster execution because 
the reduced hardware complexity reduces 
the basic instruction cycle time, and 
because successfully filled branch delay 
slots cause no performance degradation, in 
contrast to most hardware approaches. 

However, the increased instruction pipe¬ 
line length expected for GaAs processors 
requires increased compiler algorithm 
capabilities. This is because an increased 
pipeline length also increases the branch 
delay, that is, the number of instructions 
always executed after branch instructions. 
The compiler for the SU-MIPS was very 
successful at finding one instruction to 
move behind branches, but performed rela¬ 
tively poorly when attempting to find two 
or more 20 ; therefore, new branch fill-in 
algorithms are very desirable in the GaAs 
environment. 

Branch delay fill-in (sequencing hazards): 
compiler algorithms. As already indicated, 
there are essentially three ways to handle 
sequencing hazards. The first, hardware 
interlocks, is not felt to be reasonable for 
GaAs systems, both because of the addi¬ 
tional hardware required and because the 
inclusion of interlocks has a lengthening 
effect on the basic machine cycle time. The 
second way is an architectural solution that
attempts to avoid conditional branches by
providing other forms of conditional
execution. The third solution is completely
compiler based, and consists of having the
compiler attempt to move instructions into
the branch fill slots (that would otherwise
have been occupied by No-Ops). This last
option is the main thrust of this section.

Figure 7. (a) A program fragment in HLL. (b) A possible realization for the program fragment. (c) Code sequence of the program fragment with branch delays of three cycles. (d) Code sequence of the program fragment with “conditional skip of n cycles” primitive.

(a)
    if B > MAX then {OLDMAX ← MAX; MAX ← B};
    if B < MIN then {OLDMIN ← MIN; MIN ← B};
    exit

(b)
    L0: if rB <= rMAX goto L1
        mov rMAX,rOLDMAX
        mov rB,rMAX
    L1: if rB >= rMIN goto L2
        mov rMIN,rOLDMIN
        mov rB,rMIN
    L2: exit

(c)
    L0: if rB <= rMAX goto L1
        No-Op
        No-Op
        No-Op
        mov rMAX,rOLDMAX
        mov rB,rMAX
    L1: if rB >= rMIN goto L2
        No-Op
        No-Op
        No-Op
        mov rMIN,rOLDMIN
        mov rB,rMIN
    L2: exit

(d)
    L0: if rB <= rMAX skip 2
        mov rMAX,rOLDMAX
        mov rB,rMAX
    L1: if rB >= rMIN skip 2
        mov rMIN,rOLDMIN
        mov rB,rMIN
    L2: exit

First, however, let us briefly consider the 
second solution. Figure 7 (a-b) shows a 
code fragment and a possible machine 
realization. Figure 7c shows the code that 
must be used if we assume a branch delay 
of three cycles, which is typical for some 
GaAs systems. Note that even for the case 
where neither of the consequents is 
executed, nine cycles are required for exe¬ 
cution. Figure 7d shows a realization that 
is possible if we assume that a different 
conditional execution primitive—the con¬ 
ditional skip of n cycles—is available. Such 
a primitive is readily implemented. Many 
pipelined computers already contain the 
appropriate mechanisms to support precise 
interrupts. 
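
The cycle counts for Figure 7c and Figure 7d can be checked with a small Python sketch. It assumes one cycle per instruction and, for the conditional skip, that skipped instructions are nullified but still occupy their issue cycles; the article does not spell out the timing of the skip primitive, so that part is an assumption.

```python
BRANCH_DELAY = 3  # cycles, as assumed for Figure 7c


def cycles_delayed_branch(b, vmax, vmin):
    """Cycle count for Figure 7c: every branch is followed by No-Op fill slots."""
    cycles = 0
    for skip_consequent in (b <= vmax, b >= vmin):
        cycles += 1 + BRANCH_DELAY      # the branch plus its fill No-Ops
        if not skip_consequent:
            cycles += 2                 # the two mov instructions
    return cycles + 1                   # exit


def cycles_conditional_skip(b, vmax, vmin):
    """Cycle count for Figure 7d, assuming nullified instructions still take
    one cycle each, so the count does not depend on the data."""
    return 2 * (1 + 2) + 1              # two (skip + 2 movs) groups, then exit


print(cycles_delayed_branch(5, 10, 0))    # neither consequent executed
print(cycles_delayed_branch(11, 10, 0))   # first consequent executed
print(cycles_conditional_skip(5, 10, 0))
```

The first call reproduces the nine cycles quoted above for the case in which neither consequent is executed.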

Now, we consider the software solution.
The solution may be further decomposed
into (1) the case where moving instructions
into the branch fill occurs after individual
blocks of the program are already
reorganized and (2) the case where the filling
occurs during reorganization. Clearly, the
former solution is utilized in conjunction
with local reorganization and the latter in
conjunction with global reorganization.
Only the local reorganization method is
discussed here. The global method requires
wholesale changes in the flow graph of the
program that are even more sweeping in
scope than the ones indicated previously
for global reorganization. The interested
reader should refer to Linn. 19

Figure 8 depicts in a general way all the 
different places that the compiler can look 
to find instructions to plug into the branch 
fill. Here, we are assuming that block B is 
the textual successor of block A, so that 
execution proceeds from A to B, if the con¬ 
ditional branch is not taken. Otherwise, 
execution proceeds with block C. All three 
blocks shown can provide instructions to 
be moved into the branch fill following the 
conditional branch. The first place to look
for such instructions is in block A. It may
be that instruction An writes neither R3
nor R5. If so, the order of An and the
branch can be reversed; in this way, one slot
of the branch fill is used up. As usual, this
move can also be a drawback, as in the case
where C1, say, is data dependent on An. By
interchanging An with the branch, An
takes up a unit of the branch delay, but the
branch does not take up a unit of An’s
write delay. Such problems can be resolved
optimally only by considering all possible
cases; clearly, this is infeasible.

If the branch delay is more than one, we 
could continue by considering instruction 
A[n -1], A[n -2], etc., in turn until we have 
completely utilized the branch fill. If we do 
not allow ourselves to repack block A, then 
the procedure may terminate without 
utilizing all of the slots, either if block A 
is shorter than the branch delay or if any 
of the instructions write registers utilized 
by the branch. As a last note, we must 
ensure that the branch instruction is not 
scheduled so early that it attempts to read 
register values that have not yet been set. 
This method of obtaining instructions to 
use in the branch fill is called type-1 branch 
optimization in Gross. 14 
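
The procedure just described can be sketched in Python as follows. This is an illustrative reading of type-1 optimization, not the reorganizer of Gross: block A is not repacked, instructions are represented as hypothetical `(text, opcode, dest, sources)` tuples, and the minimum issue distances in `MIN_DISTANCE` are assumptions.

```python
# Assumed minimum issue distances between a write and a dependent read.
MIN_DISTANCE = {"ld": 4, "add": 2}


def type1_fill(block, branch_delay):
    """Move the terminating branch upward past trailing instructions of its
    own block, so that those instructions end up in the branch fill."""
    *body, branch = block
    branch_reads = set(branch[3])
    moved = 0
    while moved < min(branch_delay, len(body)):
        new_pos = len(body) - 1 - moved      # index the branch would move up to
        candidate = body[new_pos]
        if candidate[2] in branch_reads:
            break                            # candidate writes a branch operand
        # The branch must not read a register before its value has been set.
        too_early = any(
            dest in branch_reads and new_pos - i < MIN_DISTANCE.get(op, 1)
            for i, (_, op, dest, _) in enumerate(body[:new_pos])
        )
        if too_early:
            break
        moved += 1
    split = len(body) - moved
    return body[:split] + [branch] + body[split:]


# The block of Figure 9a.
figure_9a = [
    ("ld A[r2],r1", "ld", "r1", ["r2"]),
    ("add r2,#1,r2", "add", "r2", ["r2"]),
    ("add r3,r4,r5", "add", "r5", ["r3", "r4"]),
    ("ld B[r2],r6", "ld", "r6", ["r2"]),
    ("add #1,r5,r5", "add", "r5", ["r5"]),
    ("No-Op", "nop", None, []),
    ("add r2,r5,r5", "add", "r5", ["r2", "r5"]),
    ("bgt r8,r1,L32", "bgt", None, ["r8", "r1"]),
]
for text, *_ in type1_fill(figure_9a, branch_delay=3):
    print(text)
```

Applied to the block of Figure 9a with a branch delay of three, the sketch yields the ordering of Figure 9b. Any slot it leaves unfilled would still have to be padded with a No-Op by the caller.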

Figure 9a shows a block as it might exist
after reorganization, and Figure 9b shows
the effects of type-1 optimization. Note
that the No-Ops shown will not be covered
using any of the methods based on local
reorganization. This is one of the
advantages of proper global reorganization. For
type-1 to be truly effective, the local
reorganizer should do all that it can to give
priority to instructions that write into
registers used by the branch instruction (if
any) that terminates the block. In the
current example, an additional No-Op would
have been required if the reorganizer had
chosen to schedule the “add r3,...”
instruction first in the block, since the
branch instruction cannot be scheduled
sooner than three cycles after the
“ld A[r2],...” instruction. Actually, the
code generator can do a great deal to
facilitate the efficient removal of sequencing
hazards as well. Figure 10 (a-b) shows a
for-loop construct and a possible
realization. However, the realization of
Figure 10c is superior: the branch condition
does not depend on the new value of I, so
the increment can be done in the branch
fill. This realization cannot actually be
specified in most assembly languages since
the concept of a delayed branch is not
supported.
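
That the two realizations of Figure 10 execute the loop body the same number of times, while differing in overhead, can be checked with a small simulation. The Python sketch below assumes one cycle per instruction, ignores the cost of the loop body itself, and takes the branch delay as a parameter (at least one slot is needed for Figure 10c); these are modeling assumptions, not figures from the article.

```python
def run_figure_10b(branch_delay):
    """Increment, then branch; every fill slot holds a No-Op."""
    i, iterations, cycles = 1, 0, 1      # cycle 1: mov #1,rI
    while True:
        iterations += 1                  # loop body (its cost is not counted)
        i += 1                           # add #1,rI
        taken = i <= 100                 # ble rI,#100,L
        cycles += 2 + branch_delay       # add, ble, No-Op fill
        if not taken:
            return iterations, cycles


def run_figure_10c(branch_delay):
    """Branch on the old value of I; the increment occupies one fill slot."""
    i, iterations, cycles = 1, 0, 1      # cycle 1: mov #1,rI
    while True:
        iterations += 1
        taken = i < 100                  # blt rI,#100,L reads I before the increment
        i += 1                           # add #1,rI executes in the branch fill
        cycles += 1 + branch_delay       # blt plus its fill slots
        if not taken:
            return iterations, cycles


print(run_figure_10b(branch_delay=1))
print(run_figure_10c(branch_delay=1))
```

Both variants run the body 100 times; the second saves one overhead cycle per iteration because the increment replaces a No-Op.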

Referring again to Figure 8, a second
place to find instructions is from block C.
This method is called type-2 optimization. 14
The conditions under which the
top instruction under consideration in
block C can be moved into the branch fill
are these:

• The instruction from C must not be
moved so high in the branch fill that
a timing hazard would be created.

• The instruction from C must not violate
the resource state.

• The instruction from C must “fit”
into the branch fill. This is important
if multiple instruction lengths are
utilized. For example, a two-word
instruction cannot be moved into the
last word of the branch fill since this
will result in incorrect execution if the
branch is not taken.

• The instruction must not write any
register that is live at the top of block B.

Assuming that all of these conditions are
met, the instruction can be moved into the
branch fill at the appropriate slot. What
this means is that the text of the instruction
is moved into the appropriate slot, and the
branch target address is increased by one.

Figure 8. Locations where instructions can be found to plug into the branch fill.

Figure 9. (a) A block of code that may exist after reorganization. (b) Effects of type-1 optimization on the block of code in a.

(a)
    ld A[r2],r1
    add r2,#1,r2
    add r3,r4,r5
    ld B[r2],r6
    add #1,r5,r5
    No-Op
    add r2,r5,r5
    bgt r8,r1,L32

(b)
    ld A[r2],r1
    add r2,#1,r2
    add r3,r4,r5
    ld B[r2],r6
    bgt r8,r1,L32
    add #1,r5,r5
    No-Op
    add r2,r5,r5

Figure 10. (a) A typical for-loop construct. (b) A possible machine realization for the for-loop construct. (c) A modified realization for the for-loop construct.

(a)
    for i ← 1 to 100 do

(b)
        mov #1,rI
    L:
        add #1,rI
        ble rI,#100,L

(c)
        mov #1,rI
    L:
        blt rI,#100,L
        add #1,rI ;fill

Consider Figure 11a, where a short fragment
is depicted; temporarily we assume a
branch delay of four cycles. Figure 11b
shows that situation after type-2
optimization is performed. In essence,
instructions C1 and C2 have been moved into the
branch fill. Several items are noteworthy.
First, C1 has been moved to slot A7 instead
of A6 so that a timing hazard is not
created. Second, when a No-Op is needed
Figure 11. (a) A program fragment in assembly code. (b) Code sequence of the program fragment after type-2 optimization. (c) Code sequence of the program fragment after type-2 and then type-3 optimization.

(a)
    A1: ld A[r1],r2
    A2: mov #1583,r8
    A3: ld B[r1],r3
    A4: add #1,r1
    A5: bgt r8,r2,C1
    A6: No-Op
    A7: No-Op
    A8: No-Op
    A9: No-Op
    B1: mov #8,r9
    B2: mov #84,r10
    B3: add r2,r2,r11
    B4: exit

    C1: add #1,r3,r4
    C2: No-Op
    C3: add r2,r4,r2
    C4: exit

(b)
    A1: ld A[r1],r2
    A2: mov #1583,r8
    A3: ld B[r1],r3
    A4: add #1,r1
    A5: bgt r8,r2,C3
    A6: No-Op
    A7/C1: add #1,r3,r4
    A8/C2: No-Op
    A9: No-Op

(c)
    A1: ld A[r1],r2
    A2: mov #1583,r8
    A3: ld B[r1],r3
    A4: add #1,r1
    A5: bgt r8,r2,C3
    A6: No-Op
    A7/C1: add #1,r3,r4
    A8/C2/B1: mov #8,r9
    A9/B2: mov #84,r10



in block C for timing reasons, it must be 
treated as a normal instruction to be 
moved. Last, we could not move instruc¬ 
tion C3 into slot A9, because C3 writes reg¬ 
ister r2, and r2 is live at the top of B. 

Obviously, the last place to find instruc¬ 
tions to move will be in block B. The 
method is called type-3 optimization. The 
conditions to be met are essentially the 
reverse of the ones for type-2 optimization; 
they are 

• The instruction from B must not be 
moved so high in the branch fill that 
a timing hazard would be created. 

• The instruction from B must not vio¬ 
late the resource state. 

• The instruction from B must “fit” 
into the branch fill. 

• The instruction must not write any 
register that is live at the top of block 
C. 


Assuming that all of these conditions are 
met, the instruction can be moved into the 
branch fill at the appropriate slot. How¬ 
ever, the actual movement is a little differ¬ 
ent. Since B is the textual successor to A, 
we may simply remove a No-Op from the 
end of block A for each instruction word 
of B that is to be moved into the branch fill. 
In the current example, Figure 11c shows 
the result of performing type-3 optimiza¬ 
tion after type-2. Table 1 shows the results 
produced by a working reorganizer for 
RCA’s 32-bit GaAs microprocessor using 
the “standard” benchmarks from 
Gross. 14 
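
The register condition shared by type-2 and type-3 optimization can be illustrated on the blocks of Figure 11a. The Python sketch below checks only that condition (timing, resource, and instruction-length conditions are omitted), and it treats B and C as straight-line blocks ending in exit, so that nothing is live beyond them. The `(destination, sources)` encoding is a hypothetical representation chosen for the example.

```python
def live_in(block):
    """Registers read in a straight-line block before being written in it."""
    live, written = set(), set()
    for dest, sources in block:
        live |= {r for r in sources if r not in written}
        if dest is not None:
            written.add(dest)
    return live


def may_enter_fill(instr, other_successor):
    """Register condition only: the moved instruction must not write a
    register that is live at the top of the other successor block."""
    dest, _ = instr
    return dest is None or dest not in live_in(other_successor)


# Blocks B and C of Figure 11a as (destination, sources) pairs.
block_b = [("r9", []), ("r10", []), ("r11", ["r2"]), (None, [])]
block_c = [("r4", ["r3"]), (None, []), ("r2", ["r2", "r4"]), (None, [])]

print(may_enter_fill(block_c[0], block_b))   # C1, type-2
print(may_enter_fill(block_c[2], block_b))   # C3, type-2
print(may_enter_fill(block_b[0], block_c))   # B1, type-3
```

The check accepts C1, B1, and B2 but rejects C3, which writes r2 while r2 is live at the top of B, in agreement with the discussion of Figure 11.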

Obviously, these methods can be combined
in various ways. A particularly
interesting combination was implemented
by Gross. 14 The reorganizer for the SU-MIPS
machine was implemented in such
a way that only a very small amount of the
program was resident in memory at any
particular time. A small symbol table was
maintained that contained the first few
instructions of each block that had been
reorganized. If a backward branch was
encountered, the branch fill routines could
utilize the table information to perform
type-2 optimization. So, type-1, type-2,
and type-3 reorganization were attempted.
In the case of a forward branch, the fill
routines had no knowledge of the code at the
branch target. Thus, only type-1 and type-3
were attempted. In this way, only the
symbol table and approximately one block’s
worth of code remained memory-resident
simultaneously.

Memory. There are several memory- 
related areas where a compiler can provide 
enormous performance benefits for a 
GaAs processor. As already pointed out, 
GaAs chips will be limited to a relatively 
low transistor count; therefore, on-chip 
processor memory will be small. Because 
of the large signal propagation delays 
between chips, the cost of accessing even 
the fastest off-chip memory will be high. 
Meanwhile, accessing memory at the lower 
levels of the memory hierarchy will intro¬ 
duce extremely large amounts of dead time 
in which the processor will be idled. 

The compiler is able to ameliorate the 
off-chip memory access problem in two 
ways. First, the compiler is able to increase 
the reusability of information at the higher 
levels of the memory hierarchy. Second, the 
compiler is able to overlap program execu¬ 
tion and the fetching of information into 
the higher levels of the memory hierarchy 
(through various prefetching techniques). 
In other words, the compiler should first 
try to minimize the need for off-chip com¬ 
munications, and then should try to over¬ 
lap any remaining communication with 
useful on-chip processing. We next discuss 
how information reuse and overlap may be 
employed at three levels of the memory 
hierarchy: register file, cache, and main 
memory. 

Register file lifetime maximization:
compilation candidate. In a GaAs processor
incorporating a register-to-register execution
model, data loads and data stores are 
extremely costly. Large performance 
degradation may result for two reasons. 
First, a large delay may be directly intro¬ 
duced because of the long-latency off-chip 
environment, especially for data loads. 
Second, the presence of load and store 
instructions increases the program size, 
and this can be expected to result in 
decreased hit ratios at the higher levels of 
the instruction memory hierarchy. 
Unfortunately, data loads and stores are
used frequently in silicon processors. For 
typical silicon RISCs, approximately 10 
and 20 percent of all executed instructions 
are data stores and loads, respectively. 17 
Obviously, once data is loaded into the reg¬ 
ister file, it is very desirable to keep it there 
until it is no longer needed. One method of 
increasing the reusability of register data is 
to incorporate a larger register file—a 
method that is feasible in a silicon proces¬ 
sor but one that faces a severe implemen¬ 
tation problem in GaAs because of 
transistor limitations. An alternative 
approach that is more applicable to GaAs 
is to increase the register data reusability 
through compilation techniques. 

Register file lifetime maximization:
compiler algorithms. In order to maximize
utilization of the register file on a machine, two
separate problems must be solved. The first 
of these is the elimination of common 
subexpressions. Clearly, the number of 
data items that must be retained can have 
a large effect on the percentage of such 
items that are found in a register file at any 
given time. Thus, common subexpression 
elimination not only decreases the number 
of instructions executed but also may 
greatly decrease the number of data items. 
Local and global common subexpression 
elimination are now the standard fare of 
optimizing compilers; Aho and Ullman 21 
provide a very good overview of such tech¬ 
niques. 
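
As a reminder of what local common subexpression elimination does, here is a minimal Python sketch over three-address code. It is not taken from any particular compiler: the `(dest, op, left, right)` instruction format is hypothetical, and the sketch assumes that every name is assigned at most once in the block and that operands are compared textually (so commutativity is not exploited).

```python
def eliminate_common_subexpressions(block):
    """Local CSE over three-address code given as (dest, op, left, right).

    Assumes each destination is assigned once in the block, so an
    expression computed earlier is still valid when it recurs.
    """
    available = {}    # (op, left, right) -> name already holding that value
    alias = {}        # eliminated name -> name that replaces it
    result = []
    for dest, op, left, right in block:
        left, right = alias.get(left, left), alias.get(right, right)
        key = (op, left, right)
        if key in available:
            alias[dest] = available[key]    # reuse the earlier value; emit nothing
        else:
            available[key] = dest
            result.append((dest, op, left, right))
    return result


block = [
    ("t1", "+", "i", "j"),
    ("t2", "*", "t1", "4"),
    ("t3", "+", "i", "j"),     # same value as t1
    ("t4", "*", "t3", "4"),    # same value as t2 once t3 is replaced by t1
    ("t5", "-", "t2", "t4"),
]
for instruction in eliminate_common_subexpressions(block):
    print(instruction)
```

Two of the five instructions disappear, and with them two values that would otherwise compete for registers.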

The second consideration for maximiz¬ 
ing register lifetimes is to determine 
globally a truly good set of items to keep 
in registers at any particular time. This is 
a very difficult problem. The graph color¬ 
ing algorithm of Chaitin 22 provides an 
excellent framework for considering the 
problem and (as mentioned previously) has 
been utilized for the PL.8 compiler on the 
IBM 801 machine. The technique used is to 
convert a program flow graph, decorated 
with “live variable” information and 
“reaching definition” information, into a
graph coloring problem. The nodes of the 
graph are the instructions of the program. 
The edges of the graph represent “interfer¬ 
ences” between instructions. Here, two 
instructions are said to interfere if the reg¬ 
ister definitions of the two instructions 
must be live simultaneously. Clearly, if the 
definitions must be live simultaneously, 
then the values computed by the instruc¬ 
tions must not reside in the same register. 

Table 1. Percentage of branch fills accomplished by the current reorganizer (static count).

Benchmark           Total      3-fills  2-fills  1-fills  0-fills  Total fills
                    branches   (%)      (%)      (%)      (%)      (%)
Realmm (8x8)        29         31.0     20.7     13.8     34.5     49.4
Bubble (20 items)   23         26.1     17.4     13.4     43.5     42.0
Cal                 89         34.8     13.5     26.7     24.7     52.8
FFT                 124        24.2     17.1     45.2     13.7     50.5
Weight              131        37.4     17.6     25.2     19.1     57.5
Puzzle              84         8.3      9.5      8.3      73.8     17.5
Parse               81         23.5     19.8     8.6      48.2     39.5
Average             80+        26.5     16.5     20.2     36.8     44.2

Reorganization is performed before register
allocation so that the scheduler of the
reorganizer need not be constrained by a
physical register assignment. Then, the
resulting program is converted into its
interference graph. A polynomial-time
approximation for graph coloring is
utilized to color the graph since graph
coloring in general is NP-complete. Essentially,
the algorithm tries to 32-color the graph
since the 801 has 32 registers. The method
used is to reduce the graph by one node at
a time (utilizing appropriate heuristics for
picking good nodes) until the graph is
empty. If the algorithm never encounters
a node of degree 32 or higher, then the
procedure is quite simple and a reasonable
coloring is assured. Otherwise, it must find
some way (normally by spilling to memory
or recomputing the stored value) of
reducing the maximum degree of the graph.
Note, by the way, that the existence of
high-degree nodes doesn’t necessarily mean that
the algorithm will have to spill, provided
that some color (register) is duplicated
among the node’s neighbors.
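
The reduce-and-color procedure can be sketched compactly in Python. This is a simplified illustration in the spirit of the algorithm described above, not the PL.8 implementation: the interference graph is a plain dictionary, the spill choice (highest remaining degree) is a hypothetical placeholder for the real heuristics, and a spilled node is simply left uncolored.

```python
def color_registers(interference, k):
    """Reduce the graph one node at a time, then color in reverse order.

    `interference` maps each node to the set of nodes it interferes with.
    Returns (assignment, spilled).
    """
    graph = {n: set(adj) for n, adj in interference.items()}
    stack, spilled = [], []
    while graph:
        # Prefer a node that is trivially colorable (degree below k).
        easy = [n for n in graph if len(graph[n]) < k]
        if easy:
            node = min(easy)                 # deterministic tie-break
            stack.append(node)
        else:
            # Placeholder spill heuristic: drop the highest-degree node.
            node = max(sorted(graph), key=lambda n: len(graph[n]))
            spilled.append(node)
        for neighbour in graph.pop(node):
            graph[neighbour].discard(node)
    assignment = {}
    while stack:
        node = stack.pop()
        used = {assignment[m] for m in interference[node] if m in assignment}
        assignment[node] = min(c for c in range(k) if c not in used)
    return assignment, spilled


# Four mutually interfering values cannot share three registers.
clique = {n: {m for m in "abcd" if m != n} for n in "abcd"}
print(color_registers(clique, k=3))
```

A node removed while its degree is below k is guaranteed a free color when the stack is unwound, since at most k - 1 of its neighbours have been colored by then.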

Thus, the method provides a very con¬ 
venient framework for considering the 
problem. Note that since the program is 
considered globally, interferences caused 
by live registers at block boundaries that 
are considered in sequencing hazard elimi¬ 
nation can easily be modeled in the graph. 
Essentially the algorithm provides hooks 
for two sets of heuristics: (1) the choosing 
of a node to color and the color to use and 
(2) the choosing of a node to spill/recom¬ 
pute when coloring has failed. The 
algorithms utilizing the heuristics outlined 
in Chaitin 22 are reported to provide excel¬ 
lent results for the 801 (32 registers), as well 
as for the IBM S/370 architecture with 
only 16 registers. Register files of size 16 
seem to be well within the capabilities of 
today’s GaAs; thus, work in this area
should concentrate on utilizing this
evolving framework for register allocation.

Register file prefetching: compilation can¬ 
didate. As discussed in the previous section, 
data loads and stores are both costly and 
relatively frequent. In the previous section, 
techniques were described that reduce the 
need to access information not contained 
in the register file. A second approach for 
reducing the negative effects of data loads 
is to overlap data loads with the execution 
of other useful operations. 

As already indicated, RISC processors 
allow a compiler optimization to be uti¬ 
lized that is not available to compilers for 
CISC processors. As discussed earlier in 
this article, RISC compilers can overlap the 
latency associated with data loads with the 
execution of useful instructions through a 
technique known as load fill-in. Note that 
the load fill-in optimization may be viewed 
as a technique for moving useful instruc¬ 
tions into the fill-in slots after the data load 
instruction. Our viewpoint of this optimi¬ 
zation as a prefetching technique is based 
on the observation that the data load 
instruction is advanced, so that the initia¬ 
tion of the load request is performed before 
the data is actually required by a succeed¬ 
ing instruction; that is, it is prefetched. 
Because the load fill-in delay will likely be 
longer for a GaAs processor, the effective¬ 
ness of the fill-in implementation will have 
a more pronounced effect on performance. 

Just as the increased branch delays for 
GaAs processors introduce the need for 
fundamental compiler algorithm improve¬ 
ments, so too do the longer load delays. 

Register file prefetching: compiler
algorithms. From the point of view of the
compiler, the load delay is treated
identically to other write delays. In the previous
example of Figures 1 and 2, the load 
instructions were assumed to have a write 
delay of four cycles. In this example, the 
reorganizer is only partially successful in 
filling instructions into the load fill-in 
slots. The essential reason here is the rela¬ 
tively small number of instructions under 
consideration and therefore the small num¬ 
ber of free instructions at any particular 
instant. The example of Figure 5 shows that 
significantly better performance should be 
obtained when global reorganization is uti¬ 
lized. However, we do not have sufficient 
experience yet with global reorganization 
versus local reorganization for architec¬ 
tures with long delays as in GaAs. 

An important point should be made 
here about the usefulness of this technique. 
If no interlocks are provided (as is pro¬ 
posed), the write delay utilized will have to 
be the maximum possible delay. For exam¬ 
ple, if a load resulting in a cache hit has a 
write delay of three, whereas a miss causes 
a write delay of 12, then either the reor¬ 
ganizer must always assume a delay of 12 
or the machine must take an interrupt (its 
only form of interlock) on a cache miss. If 
misses are frequent and the interrupt han¬ 
dling sequence is longer than 12 cycles, 
considerable performance will be lost. The 
801 machine, conversely, supports inter¬ 
locks. The 801 compiler in the above situ¬ 
ation would attempt to pack between three 
and 12 instructions into the load fill 
depending on what instructions might be 
free. Considerable experimentation needs 
to be initiated to determine whether the 
performance loss of supporting interlocks 
is greater than the performance gained by 
assuming (and frequently hitting) mini¬ 
mum write delays. 
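
The trade-off between the two interlock-free choices can be made concrete with a rough model. The Python sketch below compares the average delay charged to each load when the reorganizer always assumes the miss delay with the average when it assumes the hit delay and the machine takes an interrupt on a miss. The hit and miss delays are those of the example above; the hit ratio and the interrupt cost are hypothetical, and the model ignores whatever useful instructions the reorganizer manages to place in the load fill.

```python
def delay_always_worst_case(miss_delay):
    """Every load is scheduled as if it missed."""
    return miss_delay


def delay_with_miss_interrupt(hit_ratio, hit_delay, interrupt_cycles):
    """Loads are scheduled for a hit; each miss costs a full interrupt."""
    return hit_ratio * hit_delay + (1.0 - hit_ratio) * interrupt_cycles


def break_even_interrupt_cycles(hit_ratio, hit_delay, miss_delay):
    """Interrupt cost at which both policies charge the same average delay."""
    return (miss_delay - hit_ratio * hit_delay) / (1.0 - hit_ratio)


HIT_DELAY, MISS_DELAY = 3, 12      # from the example in the text
HIT_RATIO = 0.9                    # assumed
INTERRUPT_CYCLES = 20              # assumed

print(delay_always_worst_case(MISS_DELAY))
print(round(delay_with_miss_interrupt(HIT_RATIO, HIT_DELAY, INTERRUPT_CYCLES), 2))
print(round(break_even_interrupt_cycles(HIT_RATIO, HIT_DELAY, MISS_DELAY), 2))
```

Under this crude model the interrupt approach wins as long as misses are rare; as the hit ratio falls, the break-even interrupt cost drops toward the miss delay itself.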


Cache lifetime maximization: compilation 
candidate. Cache memories are used in vir¬ 
tually all high-performance silicon com¬ 
puter systems. Caches have proved to be 
successful at providing nearly cache-level 
access speeds while introducing little to the 
overall system cost. The reason for the suc¬ 
cess of cache memories is that they success¬ 
fully exploit the locality of memory 
references. 23 Memory reference address 
patterns are not completely random. Once 
a memory location is accessed, there is a 
high likelihood that the same address will 
be reaccessed a short time later (temporal 
locality). Also, when a memory location is
accessed, there is a high probability that the
nearby addresses will be accessed soon
(spatial locality). However, note that the
level of locality will decrease after the code 
motion related to the global code optimi¬ 
zation, which has been advocated here in 
several places. 

The performance level of a cache mem¬ 
ory depends considerably on the cache hit 
ratio, that is, the fraction of memory refer¬ 
ences accessing a value actually in the 
cache. The penalty for cache misses, in 
terms of instruction cycles, will be higher 
for a GaAs processor than for a silicon 
processor. This is due to the relatively long 
access times of main memory because of 
extremely long propagation delays, if 
GaAs memory chips are used, or because 
of slower raw memory access speed, if sili¬ 
con memory chips are used. Because of the 
lower density of GaAs chips, GaAs caches 
will likely be smaller than caches for sili¬ 
con processors. Since cache memory 
capacity is the single most important 
parameter to influence the hit ratio, 24 
GaAs processor systems may experience 
significant performance degradation from 
cache misses. 
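
The interaction between hit ratio and miss penalty can be illustrated with the usual effective-access-time relation. The following is a minimal sketch; the function name and all cycle counts are hypothetical values chosen only to show the trend, not measurements from any silicon or GaAs system.

```python
def effective_access_cycles(hit_ratio, hit_cycles, miss_penalty_cycles):
    """Average cycles per memory reference for a single-level cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must lie in [0, 1]")
    # Every reference pays the hit time; misses pay the penalty on top.
    return hit_cycles + (1.0 - hit_ratio) * miss_penalty_cycles


# Hypothetical numbers: a larger silicon cache with a short miss penalty
# versus a smaller GaAs cache with a long miss penalty (in cycles).
silicon = effective_access_cycles(0.95, 1, 4)
gaas = effective_access_cycles(0.90, 1, 20)
```

Under these assumed figures, the lower hit ratio and the longer penalty compound, which is the effect the paragraph above describes.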

One way to reduce cache miss ratios is to 
ensure that, once a block of data is loaded 
into the cache from main memory, the 
maximum usage of the block is achieved 
before it is restored to main memory. That 
is, we would like to maximize the reusabil¬ 
ity of the cache block information. There 
are two ways in which a compiler can be 
helpful. 

In the first approach, the compiler 
works with the runtime block replacement 
policy by appropriately grouping together 
instructions and data. The goal here is to 
increase the correlation between the tem¬ 
poral and spatial localities of reference of 
particular information. In other words, the 
compiler increases the degree of spatial 
locality in programs. Consequently, all the 
information used within a short time span 
should also be contained within the 
minimum number of cache blocks. Note that 
one of the parts of the Parafrase* system 
actually revises the locations of data in 
memory to increase locality. 

A second approach is to allow the com¬ 
piler to override the runtime block replace¬ 
ment policy whenever it detects a more 
optimal replacement sequence. Because a 
runtime replacement policy must operate 
with little overhead, it must necessarily be 
somewhat primitive. A common runtime 
approach is the least recently used (LRU) 
method or an approximation. As its name 
indicates, the LRU approach replaces the 
block having the earliest time of last use. 
Since a compiler has a view of the entire 
program, and presumably more time to 
implement its replacement algorithm, it 
should also select better blocks for 
replacement. 
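
As a point of reference for the runtime policy just described, the following sketch counts misses under LRU replacement. It assumes a fully associative cache and a reference stream given as a list of block numbers; both are simplifications, and the function name is hypothetical.

```python
from collections import OrderedDict


def lru_misses(references, num_frames):
    """Count misses for a fully associative cache with LRU replacement."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    misses = 0
    for block in references:
        if block in cache:
            cache.move_to_end(block)       # refresh the time of last use
        else:
            misses += 1
            if len(cache) >= num_frames:
                cache.popitem(last=False)  # evict earliest time of last use
            cache[block] = True
    return misses
```

A cyclic stream such as `[1, 2, 3, 1, 2, 3]` with two frames misses on every reference under LRU, which is the kind of pattern where a compiler with a view of the whole program could in principle choose better victims.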

Because of the relatively small penalty 
for cache misses in a silicon processor 
implementation, the motivation for com¬ 
piler algorithm advances to improve cache 
performance (of the type discussed above) 
is also relatively small. However, one of 
the first-generation silicon RISCs, the 
IBM 801, incorporated special instructions 
to allow its compiler to override the run¬ 
time cache placement policy whenever it 
detected a better policy. 13 Still, in general, 
current compiler algorithms for improving 
cache performance are at a primitive state 
and need improvement for a better utiliza¬ 
tion of GaAs technology. 


Cache lifetime maximization: compiler 
algorithms. For any but the most trivial pro¬ 
grams, it is very difficult for a compiler to 
determine the set of blocks in memory at 
any given time; thus, it is difficult for the 
compiler to give much assistance to the 
replacement policy. There are a few cases, 
however, where the compiler can assist in 
identifying good candidate blocks to be 
replaced. Consider a flow graph of the type 
shown in Figure 12a. Here, we see a loop 
and an if-test inside. If the compiler knows 
(from traces, for example) that the conse¬ 
quent of the if statement is not executed a 
very large percent of the time, then the 
compiler might consider instructing the 
cache that Block B is an excellent candidate 
for the next replacement in its cache set. 
Figure 12b shows a program with two 
loops. The compiler here would instruct 
the cache to mark for replacement appro¬ 
priate cache frames from the first loop. In 
this way, runtime routines shared between 
the two loops may be spared an inoppor¬ 
tune replacement. Marking data (as 
opposed to instructions) for replacement 
is a much more difficult task. 

Figure 12. (a) A flow graph with a loop 
and an if-test inside. (b) A flow graph 
with two loops. (Note that the diamond 
box represents a test block.) 

Figure 13. Code sequence for the loops 
from Figure 12. 

Packing the program so as to minimize 
the number of cache frames active at any 
particular time could be a very effective 
way to obtain an apparent increase in 
cache size. For instruction blocks, this 
essentially amounts to ensuring that labels 
in the program fall on appropriate 
addresses relative to the start of memory 
frames. Consider the program of Figure 13. 
Here, we have a program with an outer 
loop and an inner loop. The brackets 
indicate cache/memory frames. Note that 
because of the alignment of the program, 
the inner loop actually requires three 
cache frames instead of two. Thus, the 
probability that one of the frames will be 
displaced is 50 percent higher than if II 
were aligned on a cache frame boundary; 
for inner loops with complicated flow 
graphs the percentages can increase 
dramatically. 
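
The alignment effect can be checked with a few lines of arithmetic. This is a sketch under assumed sizes (the addresses, loop length, and frame size below are hypothetical); it only counts how many frames a contiguous piece of code touches.

```python
def frames_spanned(start_addr, length, frame_size):
    """Number of cache/memory frames touched by code at [start, start+length)."""
    if length <= 0 or frame_size <= 0:
        raise ValueError("length and frame_size must be positive")
    first = start_addr // frame_size
    last = (start_addr + length - 1) // frame_size
    return last - first + 1


def aligned_start(start_addr, frame_size):
    """Round start_addr up to the next frame boundary."""
    return -(-start_addr // frame_size) * frame_size


# Hypothetical inner loop: 32 words of code, 16-word frames.
misaligned = frames_spanned(24, 32, 16)                  # straddles three frames
aligned = frames_spanned(aligned_start(24, 16), 32, 16)  # fits in two frames
```

A compiler or linker would obtain the aligned placement by padding before the loop label, at the cost of a few unused words.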

Similar considerations could be given to 
data frames. For example, one could deter¬ 
mine the influence of the affinity of access 
that various data objects display. The var¬ 
ious components of a stack frame, or the 
fields in a frequently accessed record type, 
could be subjected to such analysis. In the 
case of record fields, the fields would be 
rearranged so that items that are accessed 
similarly (in time) would be packed 
together into the fewest possible frames. 
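
The benefit of rearranging record fields can be sketched by counting the frames touched when a group of fields is referenced together. The layout, field names, and sizes are hypothetical, and the sketch assumes equal-size fields that never straddle a frame boundary.

```python
def frames_touched(layout, accessed, field_size, frame_size):
    """Frames touched when the fields in `accessed` are referenced together.

    layout: field names in memory order, all of size field_size.
    """
    offsets = {name: i * field_size for i, name in enumerate(layout)}
    return len({offsets[name] // frame_size for name in accessed})


hot = {"a", "b"}  # fields that are accessed close together in time
before = frames_touched(["a", "x", "b", "y"], hot, field_size=4, frame_size=8)
after = frames_touched(["a", "b", "x", "y"], hot, field_size=4, frame_size=8)
```

Here the reordered layout places the two hot fields in one frame instead of two.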

Historically, cache memories have 
enjoyed high hit rates and/or low 
penalties for misses. Such assumptions 
are not necessarily valid for GaAs 
architectures; nevertheless, little work 
has been reported in attempting to apply 
such software-controlled cache techniques. 
Our opinion is that these techniques are 
unlikely to produce large improvements in 
the efficiency of GaAs systems; however, 
there have been no experimental 
validations of these “engineering 
intuitions.” 

Cache prefetching: compilation candidate. 
As discussed in the previous section, 
memory accesses resulting in cache misses 
can reduce the performance of GaAs 
processor systems significantly. In the 
previous section, we described compilation 
techniques for reducing the number of 
replacements (of cache blocks) containing 
data that will soon be needed by the 
processor. There are also compilation 
techniques for reducing the cache miss 
rate by loading information into the cache 
just before it is required by the 
processor, in parallel with ongoing 
on-chip processing (another approach is to 
try to reduce the negative effects of a 
cache miss). 

Runtime techniques for prefetching 
cache blocks must be simple to minimize 
overhead delays. One example approach is 
the “one block lookahead” method. 25 In 
one variation of this approach, the hard¬ 
ware ensures that block i + 1 is in the cache 
when block i is accessed. If block i + 1 is 
not resident in the cache, it is then fetched. 
As with most runtime memory manage¬ 
ment techniques, this approach yields 
benefits because it successfully exploits the 
locality of references, in this case spatial 
locality. A compiler-based approach is 
advantageous because it is not forced to 
rely exclusively on locality exploitation. 
The compiler has a much broader view of 
program behavior than the runtime mech¬ 
anism just mentioned. For example, the 
compiler can detect occurrences of branch 
instructions well before a runtime 
mechanism can. With the ability to 
correctly predict the cache blocks to 
prefetch, compiler 
algorithms offer a potentially large per¬ 
formance enhancement capability. 

Again, the smaller reward for compiler- 
based cache block prefetching in the silicon 
environment has limited the motivation for 
developing sophisticated compiler 
algorithms. In GaAs processor implemen¬ 
tations, however, increased potential 
benefits encourage algorithm advances to 
exploit this compiler optimization oppor¬ 
tunity. 


Cache prefetching: compiler algorithms. 
While it is possible to allow the software to 
completely manage the cache in the same 
way that it manages its registers, such a sit¬ 
uation is unlikely to be completely accept¬ 
able since a well-understood source of 
parallelism may be forfeited in the process. 
Further, the amount of process state that 
must be saved and restored at a context 
switch would be greatly increased. Thus, 
caches will probably continue to support 
primitive demand policies such as LRU. 
However, a compiler armed with trace 
probabilities could likely do a superior 
job to such crude policies as “fetch 
i + 1”. At the entry to any basic block of 
code, the compiler would insert cache 
management instructions to ensure that 
successive cache frames are available as 
needed. Additionally, the compiler could 
ensure that likely successor blocks fall 
in different cache frames than the last 
few frames of the current block, so that 
these can be prefetched as well. Thus, 
program addresses 
could be assigned by picking likely traces 
through the program and ensuring that 
addresses likely to be executed in the same 
time frame are allocated in different cache 
sets. As a special case, prefetching across 
subroutine boundaries should be relatively 
straightforward, particularly if cache set 
sizes greater than one are supported. As on 
the 801, special instructions should be 
provided for prefetching; the ability to 
prefetch several consecutive blocks simul¬ 
taneously could be useful. However, high 
interrupt densities might reduce the posi¬ 
tive effects of this process. 

Similar methods could be applied to 
data frames. A very important conse¬ 
quence of moving the prefetching to soft¬ 
ware instead of hardware is that the 
number of incorrect prefetches should be 
greatly decreased. For example, several 
authors have noted that branches occur 
very often in compiled code. Thus, the 


“fetch i + 1” strategy will end up fetching 
a great many frames that are not utilized. 
For GaAs caches, this will mean that a 
reasonable number of frames are discarded 
before the end of their useful lifetimes. 
Another important consideration is that a 
software-based solution for data prefetch¬ 
ing can take array strides into account, 
whereas this is not possible with any 
simple implementation of the “fetch i + 1” 
method. 
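
The difference between “fetch i + 1” and a stride-aware prefetch can be sketched with a toy model. The model is an assumption made for illustration: a prefetch counts as useful only if the predicted block is the very next block referenced, and all names are hypothetical.

```python
def prefetch_stats(blocks, predict):
    """Return (useful, wasted) prefetch counts for a block reference stream.

    predict(cur) names the block to prefetch after block cur is referenced.
    """
    useful = wasted = 0
    for cur, nxt in zip(blocks, blocks[1:]):
        if predict(cur) == nxt:
            useful += 1
        else:
            wasted += 1
    return useful, wasted


def fetch_next(cur):
    """Hardware-style one block lookahead."""
    return cur + 1


def fetch_stride(stride):
    """Compiler-style prefetch that knows the array stride, in blocks."""
    def predict(cur):
        return cur + stride
    return predict


strided = [0, 4, 8, 12, 16]  # hypothetical array walk, stride of four blocks
lookahead_result = prefetch_stats(strided, fetch_next)
stride_result = prefetch_stats(strided, fetch_stride(4))
```

On this stream every lookahead prefetch is wasted while every stride-aware prefetch is used; on a purely sequential stream the two policies coincide.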

As before, the historic assumptions for 
cache environments have created a situa¬ 
tion in which little work has been done in 
this area. Our opinion is that instruction 
prefetching is likely to result in excellent 
performance improvements, especially 
when the prefetch instructions are placed 
where No-Ops would have been. No exper¬ 
imental validation of these opinions has yet 
surfaced. 


Main memory lifetime maximization and 
prefetching: compilation candidate. The size 
of the main memory of computer systems 
has increased enormously in the last several 
years as silicon technology has continued 
to develop. The main memory for a GaAs 
processor will very likely consist of silicon 
memory parts for at least two reasons. 
First, because of the much lower density of 
GaAs parts, an enormous number would 
be required in order to implement a large 
memory. The propagation delay between 
the processor and furthest memory chips 
would be very large, especially if they are 
on separate boards. Therefore, the access 
delay for a GaAs main memory would 
probably not be significantly better than 
for a silicon main memory. Second, the 
large cost of such a great number of rela¬ 
tively expensive GaAs chips would be pro¬ 
hibitive for all but the most demanding 
applications. 

In some applications requiring the capa¬ 
bilities of a GaAs processor, there may be 
no need for a backing store memory such 
as a magnetic disk. However, in those 
applications that do use a backing store, 
the cost of accessing it is very high. Mag¬ 
netic disk access typically requires several 
milliseconds; therefore, a GaAs processor 
would waste a significant amount of time 
waiting for disk accesses to complete, or 
perhaps lose performance because of fre¬ 
quent context switching overhead. 
Minimizing the number of disk accesses is 
certainly important for a silicon processor 
system, but is more important for a system 
that hopes to exploit the speed advantages 
of a GaAs processor. 


Systems utilizing backing store memo¬ 
ries generally transfer large blocks of data 
between the backing store and main mem¬ 
ory. These blocks are usually referred to as 
either pages or segments. Pages are of fixed 
size, while segments correspond to logical 
entities within high-level language pro¬ 
grams, such as procedures or data struc¬ 
tures, and are of variable size. Main 
memory then is divided into pages, seg¬ 
ments, or perhaps a combination of the 
two. 

Paralleling the previous discussion of 
cache memories, what we desire here are 
compilation techniques that decrease the 
frequency of backing store access or, alter¬ 
natively, that increase the reusability of 
data in main memory (another approach 
is to try to decrease the negative effects of 
backing store access). There are two 
approaches, which are similar in their basic 
goal, but different in the algorithms to be 
employed. 

In the first approach, the compiler re¬ 
structures instructions and data to increase 
the effectiveness of the runtime page/seg¬ 
ment replacement policy. The compiler 
attempts to place into the same page, or set 
of pages, all the information that will be 
used within nearby intervals of time. 

In the second approach, the compiler 
selectively overrides the runtime 
page/segment replacement policy when it 
discovers 
more effective replacement strategies. For 
reasons noted in the previous section, the 
compiler can be expected to implement a 
better replacement policy than a runtime 
mechanism can. 

Note that the above two approaches can 
be combined, which has been done in some 
systems. 

Memory requests requiring backing 
store access, for example, magnetic disk, 
can penalize a GaAs processor severely. 
Instead of using methods for keeping use¬ 
ful data in main memory, there are 
compiler-based techniques for loading 
information into main memory before the 
processor requires it. 

Main memory runtime prefetch strate¬ 
gies must be very simple for the same rea¬ 
son discussed for cache runtime prefetch 
techniques. However, because the compiler 
can sometimes predict the usage of instruc¬ 
tions and data, it seems to offer the poten¬ 
tial for a highly successful prefetch 
mechanism. A compiler-based algorithm 
may be used for either a paged or seg¬ 
mented memory. For example, if a proce¬ 
dure is to be eventually called from within 
the currently executing procedure, the 
instructions for the new procedure may be 
prefetched. The effectiveness of this 
approach is severely limited by the 
extremely long seek times and access laten¬ 
cies of backing stores. Because context 
switches are normally performed when a 
page/segment fault is encountered, a 
prefetch algorithm is less profitable here 
than in the case of cache misses. Neverthe¬ 
less, the fast instruction cycle times of 
GaAs processors create a higher potential 
performance enhancement if adequate 
compile-time algorithms could be found. 

Main memory lifetime maximization and 
prefetching: compiler algorithms. This 
potentially significant research area has 
been essentially abandoned in recent 
years. As memory sizes have increased, the 
notion that virtual memory systems 
utilizing demand paging and simple runtime 
replacement policies are sufficient has 
become firmly rooted. Just as main memory 
caches have decreased the apparent access 
time of main memory, so too has disk 
caching been employed to reduce the 
effective access time of disk memories. 
Thus, the environment has been one in 
which there was no payoff for techniques 
involving program restructuring. 

Thus, the problems and solution possi¬ 
bilities offered here have been essentially 
ignored for a decade. But the new 
programs in GaAs may provide an impetus for 
renewed research, and Hatfield and Ger¬ 
ald, 26 Hatfield, 27 Smith, 28 Ferrari, 29 and 
Trivedi 30 are elements of a good starter set 
of references in this area. 

Multilevel cacheless memory systems: 
compilation candidate. As discussed earlier, 
cache memories of small capacity greatly 
improve computer system performance 
because they successfully exploit temporal 
and spatial localities of reference. The 
standard caching mechanism is imple¬ 
mented in hardware within the memory 
system. However, a hardware caching 
mechanism has two characteristics that 
reduce its desirability for implementation 
in a GaAs processor system. First, cache 
memories require a significant amount of 
hardware overhead for their implementa¬ 
tion. This extra hardware may introduce 
signal propagation delay problems because 
it causes other hardware elements to be 
located farther from the processor, perhaps 
even to another board. Second, caches 
generally add a delay to the memory access 
time to determine whether the desired 
block is present in the cache, even when a 


translation lookaside buffer is used. This 
delay may be effectively eliminated if it is 
implemented as a stage in a memory pipe¬ 
line. Yet this adds another instruction to the 
branch and load fill-in delays and is, there¬ 
fore, not desirable either. 

Registers are advantageous in that they 
require no caching overhead. Extending 
the register file concept to main memory 
may prove to be advantageous in a GaAs 
processor system, and we will now describe 
some of the advantages of this technique. 

First of all, just as registers are accessed 
directly through their physical address, so 
too are the main memory locations in a 
cacheless system; therefore, a logical-to- 
physical address translation is not required 
for main memory accesses. 


The cacheless memory system exploits 
the spatial and temporal localities of refer¬ 
ence in the same manner as a cache-based 
system. The major difference is that the 
cacheless memory system with the same 
hardware complexity as a cache-based sys¬ 
tem will have more of its hardware devoted 
to instruction and data storage. 

Finally, the cacheless memory system 
requires the compiler to explicitly load 
instructions and data into main memory in 
the same manner that a conventional com¬ 
piler must load data into the register file of 
a conventional processor. A cacheless 
memory system allows the compiler to per¬ 
form load fill-in on these loads. This is in 
contrast to a hardware-imposed pipeline 
suspension when a cache miss occurs in a 
cache-based system. 

The cacheless memory system concept 
requires a sophisticated compiler in order 
to take advantage of the additional oppor¬ 
tunities for optimization, as well as to 
implement the replacement policy. 

Multilevel cacheless memory systems: 
compiler algorithms. While we are not aware of 
any ongoing research in this area, there are 
essentially two methods that could be 
employed here, utilizing existing 
algorithms and knowledge. Both of these 
techniques attempt to map the problem 
onto the register allocation problem stati¬ 
cally. The first approach would operate in 


the manner of naive but not trivial regis¬ 
ter allocators that assign the most fre¬ 
quently used items in a program fragment 
(say, a procedure) to permanent locations 
in the fast memory and use the rest of the 
faster memory as a buffer for the rest of the 
memory locations. If the faster memory 
were large, the overhead of the spilling and 
reloading operations could be managed. 
Such an algorithm would be simple to 
implement and would be very effective if 
the code being executed changed very 
slowly. Data beyond what could be held in 
a reasonable register file, however, do 
not seem to be accessed in nearly so 
regular a pattern. For many situations, 
the program would be better off just 
accessing the data from the slower memory 
without the overhead of first fetching it 
into the fast memory. 

A second alternative (and others are pos¬ 
sible) is to use a more sophisticated algo¬ 
rithm such as the graph coloring algorithm 
mentioned earlier to find a reasonable 
mapping. The code would be divided into 
frames. Frames would be said to interfere 
if they need to be in fast memory at the 
same time for fast execution. Again, graph 
coloring could be used to find an appropri¬ 
ate static mapping between the program 
frames and physical frames in the fast 
memory. The compiler then adds code to 
ensure that frames are in memory before 
they are executed. This technique could 
also be applied to moderate-size data sets, 
that is, those about the size of an 
activation record. However, for large data 
sets (e.g., the atoms in a Lisp program) 
the computation would not be feasible. 
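
The mapping of program frames onto fast-memory frames can be sketched as a coloring problem. The sketch below uses a simple greedy heuristic, not the specific graph coloring algorithm referred to in the text, and the interference graph and names are hypothetical.

```python
def color_frames(interference, num_physical):
    """Greedily map program frames to physical fast-memory frames.

    interference: dict mapping each frame to the set of frames that must be
    resident at the same time as it. Returns a dict frame -> physical frame
    index, or None if this greedy order runs out of physical frames.
    """
    assignment = {}
    # Color the most constrained frames first (a common heuristic).
    order = sorted(interference, key=lambda f: (-len(interference[f]), f))
    for frame in order:
        used = {assignment[n] for n in interference[frame] if n in assignment}
        free = [p for p in range(num_physical) if p not in used]
        if not free:
            return None  # a real allocator would split or spill here
        assignment[frame] = free[0]
    return assignment


# Hypothetical frames: A interferes with B and C; B and C never overlap.
graph = {"A": {"B", "C"}, "B": {"A"}, "C": {"A"}}
mapping = color_frames(graph, 2)
```

Frames B and C can share one physical frame because they are never needed at the same time; the compiler would then insert loads wherever execution moves between them.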

Finally, dynamic techniques could also 
be employed. However, we are not aware of 
any ongoing research in this area. 

Memory-initiated instruction prefetching: 
compilation candidate. Instruction fetching 
and instruction execution are fairly 
independent operations. Instruction fetch¬ 
ing is mainly a memory system function, 
with some processor participation, while 
instruction execution is a processor respon¬ 
sibility. Because it makes sense in a GaAs- 
type environment, where communication 
costs are large, to perform locally as much 
work as possible, it may be useful to move 
the instruction fetching mechanism local 
to the main memory, when possible. A 
similar approach was studied in Patterson 
et al. 31 

An instruction memory system has all 
the information necessary to calculate the 
address of the next instruction (program 
counter) to be executed, independently of 
the processor in most cases. In sequential 
execution, the PC is simply the current PC 
incremented by one. On unconditional 
branches, the memory system can perform 
the destination address calculation, if a 
PC-plus-displacement operation is used. 
On conditional branches, however, the 
memory system must rely on help from the 
compiler to better decide which instruction 
to fetch. The compiler can indicate its 
choice by setting a bit in the instruction. 
However, it is not until the processor has 
completed the condition evaluation that 
the memory system knows whether it has 
fetched correctly. 
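
The three cases above can be sketched as a next-address function evaluated on the memory side. The instruction representation (a dictionary with `kind`, `displacement`, and `predict_taken` fields) is hypothetical, and the sketch ignores branch delay slots.

```python
def next_fetch_address(pc, instr):
    """Address the memory system fetches next, without asking the processor."""
    kind = instr["kind"]
    if kind == "branch":                    # unconditional, PC-relative
        return pc + instr["displacement"]
    if kind == "cond_branch" and instr["predict_taken"]:
        return pc + instr["displacement"]   # compiler-set prediction bit
    return pc + 1                           # sequential execution


def must_refetch(instr, taken):
    """True when the processor's condition result contradicts the guess."""
    if instr["kind"] != "cond_branch":
        return False
    return instr["predict_taken"] != taken


loop_branch = {"kind": "cond_branch", "displacement": -8, "predict_taken": True}
guess = next_fetch_address(100, loop_branch)
```

Only the final check depends on the processor, which is why the memory system can run ahead until the condition has been evaluated.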

The advantage of such an approach, 
beyond the benefits of parallel address cal¬ 
culations, is that it eliminates the delay for 
propagation of the address from the 
processor to memory. This delay may rep¬ 
resent one entire pipeline stage in a GaAs 
processor. 

Memory-initiated instruction prefetching: 
compiler algorithms. We have assumed 
throughout that the compiler is capable of 
determining appropriate branch probabil¬ 
ities for a program (even though this may 
be quite difficult). Thus, having the com¬ 
piler set a bit in the instruction to predict 
the branch direction is well within current 
technology. Besides this, and filling with 
instructions during the branch fill for con¬ 
ditional branches, the compiler can be of 
little additional benefit here. 


Arithmetic. The use of low-transistor- 
count processors requires the implementa¬ 
tion of hardware-intensive arithmetic units 
external to the processor chip. One such 
candidate for implementation into an 
arithmetic coprocessor is the multiplica¬ 
tion function. Other candidates that may 
be considered include floating-point hard¬ 
ware, etc. 

Complex arithmetic delay fill-in: 
compilation candidate. The following 
discussion applies to all of the 
coprocessor candidates mentioned above. 

One problem with coprocessors is the 
long latency involved in providing the 
processor with the result after the proces¬ 


sor requests it. Compilation techniques 
may be used to reduce the negative conse¬ 
quences of such long latencies. 

The compiler-derived benefits here are 
much the same as for the load delay fill-in 
solution. In both cases the compiler 
attempts to keep the processor data path 
busy at the same time that a parallel 
resource is active. However, restrictions 
related to the fill-in contents (instructions 
available for fill in) are different. In a GaAs 
processor implementation, the benefits 
derived are very substantial. First of all, as 
indicated earlier, the limited transistor 
count of GaAs almost certainly dictates 
that some type of coprocessor will be 
implemented. Secondly, the high penalty 
of off-chip communication means that a 
large potential performance gain will be 
achieved through a sophisticated compiler 
implementation. 

As in the case for both the branch fill- 
in and data load fill-in optimizations 
described earlier, current compiler 
algorithms for performing arithmetic 
delay fill-in are ineffective for the longer 
latencies presented by a GaAs processor. 
Compiler algorithm advances are neces¬ 
sary for a greatly improved fill-in capabil¬ 
ity. Note that similar algorithms can be 
applied to I/O. 

Note also that multiple coprocessors 
associated with a single CPU and running 
multiple job streams could result in a good 
utilization of the CPU’s computational 
capability. 

Complex arithmetic delay fill-in: compiler 
algorithms. There are actually two problems 
here. In one, the delays involved are on the 
same order as that for a memory load, or 
even a little longer. In the other, delays an 
order of magnitude worse than a memory 
load are encountered. In the first case, the 
methods presented in discussing how to 
deal with timing hazards are quite ade¬ 
quate. For example, an on-chip, off-data- 
path serial multiplier might have a delay of 
five cycles. It is straightforward to deal with 
such delays. 

Conversely, a floating-point coprocessor 
chip might have a delay of 140 cycles for an 
operation such as floating-point divide. In 
such cases, one would like to attempt to 
overlap the computation with as much of 
another part of the program as possible. As 
was pointed out earlier, this type of 
arrangement is beyond the scope of the 
current global reorganization algorithms, 
both because of the long resource utiliza¬ 
tion requirements and because of the fact 


that the current techniques cannot account 
for the delay time utilized by loops. Much 
more work in this area is needed. 
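
The gap between the two cases can be made concrete with a simple count of unfilled cycles. This sketch assumes one instruction issued per cycle; the number of independent instructions the compiler can find is a hypothetical input.

```python
def stall_cycles(latency, independent_instructions):
    """Cycles left idle (or filled with no-ops) while waiting for a result."""
    if latency < 0 or independent_instructions < 0:
        raise ValueError("arguments must be non-negative")
    return max(0, latency - independent_instructions)


# Five-cycle multiplier with five independent instructions available.
short_case = stall_cycles(5, 5)
# 140-cycle divide with only a dozen independent instructions nearby.
long_case = stall_cycles(140, 12)
```

With the assumed dozen fillers, most of the 140-cycle latency remains exposed, which is why overlap with a much larger part of the program is needed.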

As a last resort, the floating-point 
coprocessor in the above example could be 
treated exactly like an I/O device, and the 
processor’s attention could be switched to 
another process when such operations are 
initiated. 

Strength reduction: compilation candidate. 
GaAs processor systems lose much of their 
performance advantage over silicon sys¬ 
tems when performing arithmetic opera¬ 
tions outside of the central processor. In 
fact, using early GaAs technology, a GaAs 
processor may achieve a 5:1 speedup over 
a silicon processor, yet a GaAs arithmetic 
coprocessor only experiences a 2:1 
speedup. 32 

There are two reasons for the relatively 
poor performance of a GaAs arithmetic 
coprocessor. First, the communication 
necessary to perform a coprocessor func¬ 
tion requires more time, in terms of instruc¬ 
tion cycles, for a GaAs processor. Second, 
the severe transistor and area limitations 
typical for GaAs restrict arithmetic func¬ 
tion designs to those utilizing iterative 
approaches rather than parallel 
approaches. 

In order to achieve maximal speedup 
through the use of GaAs technology, the 
frequency of complex operation execution 
should be minimized. There are techniques 
for reducing the frequency of complex 
arithmetic operations. 

Strength reduction: compiler algorithms. 
Reduction in strength is achieved by com¬ 
pilers in essentially three ways. 

The first way is to do a special case anal¬ 
ysis on the operands of any particular 
operation to determine if a cheaper 
implementation is available. Typical 
examples of this include adding instead of 
multiplying by two (or three), 
right-shifting instead of dividing by a 
power of two, etc. 
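
The special cases named here can be written out directly. This is a sketch with hypothetical function names; the shift form is shown only for non-negative operands, since on machines whose signed division truncates toward zero a plain arithmetic shift gives a different result for negative values.

```python
def times_two(x):
    return x + x        # one addition instead of a multiply by two


def times_three(x):
    return x + x + x    # two additions instead of a multiply by three


def div_pow2(x, k):
    """x divided by 2**k for non-negative x, using a right shift."""
    if x < 0:
        raise ValueError("shift form shown only for non-negative operands")
    return x >> k
```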

The second method is the substitution of 
addition for multiplications when the mul¬ 
tiplications are found in a loop and one of 
the operands is proportional to the loop 
index. These are standard compiler fare 
and the appropriate algorithms are found 
in Aho and Ullman. 21 
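
The loop case can be illustrated by computing the same sequence of addresses with and without the multiplication. The names are hypothetical; the second function shows what the optimized loop body computes.

```python
def addresses_multiply(base, stride, n):
    """Original form: one multiplication per iteration."""
    return [base + i * stride for i in range(n)]


def addresses_reduced(base, stride, n):
    """Strength-reduced form: the product is replaced by a running sum."""
    out = []
    addr = base
    for _ in range(n):
        out.append(addr)
        addr += stride  # addition replaces i * stride
    return out
```

Both forms yield the same addresses; the reduced form keeps the running value in a register and never uses the multiplier.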

The third technique for achieving a 
reduction in strength is by caching. Exam¬ 
ples of this include the use of displays for 
stack frame offsets, and the use of mar¬ 
ginal index vectors. In each case, a rela¬ 
tively lengthy computation is replaced by 
a memory reference. For example, a 
marginal index converts multiplications into 
memory references. In the case of a GaAs 
processor, neither memory nor multiplica¬ 
tions are cheap; thus, this type of optimi¬ 
zation must be used with great care. 
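
One reading of the marginal index idea, assumed here for illustration, is a precomputed table of row-start offsets for a two-dimensional array. The names are hypothetical.

```python
def build_marginal_index(num_rows, row_length):
    """Precompute the offset of each row start, once, by repeated addition."""
    table = []
    offset = 0
    for _ in range(num_rows):
        table.append(offset)
        offset += row_length
    return table


def element_offset(marginal, row, col):
    """Offset of element (row, col): a table lookup plus an add, no multiply."""
    return marginal[row] + col


marginal = build_marginal_index(4, 10)
```

Each access trades a multiplication for a memory reference to the table, which is exactly the trade the text cautions about on a GaAs processor, where neither is cheap.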

Instruction format. An appropriate 
choice of instruction format will enable the 
employment of different instruction pack¬ 
ing schemes. Instruction packing effec¬ 
tively increases the memory bandwidth, 
which is so desirable in GaAs systems. On 
the other hand, it imposes new challenges 
for compiler writers. 

Among a variety of different approaches 
to instruction packing, we have chosen two 
to discuss here. We refer to them as MIPS- 
style packing, 12,14 and transputer-style 
packing. 33 In the first case, two instruc¬ 
tions with short or no immediate fields are 
packed together under the condition that 
no pipeline conflicts will be created. In the 
second case, all instructions are of the same 
8-bit length, and four of them can be 
packed together. In both cases, 32-bit units 
are fetched from the main memory at one 
time. 
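
Packing four 8-bit instructions into one 32-bit fetch unit can be sketched as follows. The byte order (first instruction in the low-order byte) and the function names are assumptions made for illustration, not a description of any particular machine.

```python
def pack_word(opcodes):
    """Pack up to four 8-bit instructions into one 32-bit word."""
    if len(opcodes) > 4 or any(not 0 <= op <= 0xFF for op in opcodes):
        raise ValueError("need at most four values in 0..255")
    word = 0
    for i, op in enumerate(opcodes):
        word |= op << (8 * i)   # instruction i occupies byte i
    return word


def unpack_word(word, count=4):
    """Recover the packed 8-bit instructions in issue order."""
    return [(word >> (8 * i)) & 0xFF for i in range(count)]


word = pack_word([0x12, 0x34, 0x56, 0x78])
```

One memory access then delivers four instructions, which is the sense in which packing raises the effective memory bandwidth.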


MIPS-style packing: compilation candidate. 
This type of packing is more efficient; 
that is, more instructions could be packed 
if the instruction format has shorter 
immediate fields. However, shorter 
immediate fields pose certain constraints 
to be overcome by the compiler. Con¬ 
straints are even more severe if we insist on 
single-word instructions, which is espe¬ 
cially important in GaAs systems. 


MIPS-style packing: compiler algorithms. 
The ability to pack multiple instructions 
into a single word complicates the model 
of instructions significantly, but has sur¬ 
prisingly little effect on the required reor¬ 
ganization algorithm (in silicon systems). 
What actually happens is that an instruc¬ 
tion is considered to have several “ver¬ 
sions,” each with its own resource 
utilization and read/write delays. 

With these simple additions, it is clear 
how the reorganization algorithms must be 
modified. First, when we are trying to 
arrive at a set of available candidates, we 
must check to see if any version of an 
instruction fulfills the current resource 
utilization constraints. Next, we must use 
the write delay of the version actually 
scheduled to determine when data depen¬ 
dency successors become data available. 
Last, we must check to ensure that the read 
delay of a version under consideration is 


truly sufficient to allow the data to be 
passed. These additional constraints are 
easily supported within the general context 
of the reorganization algorithm presented 
earlier. A formal presentation of all the 
details is found in Linn. 18 
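
The three checks described above can be sketched in a few lines. This is a simplified illustration, not the algorithm from Linn: instructions are issued in program order (no reordering), only true data dependences are modeled, and the meanings given to `read_delay` and `write_delay` (cycles after issue at which operands are read and the result becomes usable) are assumptions of this example. The resource names `half_a` and `half_b` are hypothetical stand-ins for the two halves of a packed word.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Version:
    resources: FrozenSet[str]  # resources occupied in the issue cycle
    read_delay: int            # cycles after issue when operands are read
    write_delay: int           # cycles after issue when the result is usable


@dataclass(frozen=True)
class Instr:
    name: str
    dest: Optional[str]
    srcs: Tuple[str, ...]
    versions: Tuple[Version, ...]


def pick_version(ins: Instr, cycle: int, free: Set[str],
                 avail: Dict[str, int]) -> Optional[int]:
    """Index of the first version that fits the free resources and whose
    read delay lets every operand arrive in time, or None."""
    for idx, v in enumerate(ins.versions):
        fits = v.resources <= free
        data_ok = all(avail.get(s, 0) <= cycle + v.read_delay for s in ins.srcs)
        if fits and data_ok:
            return idx
    return None


def schedule(block: List[Instr],
             resources: FrozenSet[str]) -> List[Tuple[int, str, int]]:
    """Return (issue cycle, instruction name, version index) per instruction."""
    for ins in block:
        if not any(v.resources <= resources for v in ins.versions):
            raise ValueError(f"{ins.name}: no version fits the machine")
    avail: Dict[str, int] = {}  # register -> cycle its value becomes usable
    plan: List[Tuple[int, str, int]] = []
    cycle = 0
    free = set(resources)
    for ins in block:
        idx = pick_version(ins, cycle, free, avail)
        while idx is None:          # nothing fits: move to the next cycle
            cycle += 1
            free = set(resources)
            idx = pick_version(ins, cycle, free, avail)
        v = ins.versions[idx]
        free -= v.resources
        if ins.dest is not None:
            # The write delay of the version actually scheduled decides
            # when dependent instructions may proceed.
            avail[ins.dest] = cycle + v.write_delay
        plan.append((cycle, ins.name, idx))
    return plan


WORD = frozenset({"half_a", "half_b"})


def alu(name: str, dest: str, srcs: Tuple[str, ...]) -> Instr:
    """Hypothetical ALU instruction with two packed versions and one full-word version."""
    return Instr(name, dest, srcs, (
        Version(frozenset({"half_a"}), 0, 1),
        Version(frozenset({"half_b"}), 0, 1),
        Version(WORD, 0, 2),
    ))


block = [
    alu("i1", "r1", ("r2", "r3")),
    alu("i2", "r4", ("r5", "r6")),
    alu("i3", "r7", ("r1", "r4")),
]
plan = schedule(block, WORD)
print(plan)
```

In this example the first two instructions share cycle 0 by taking different versions, while the third must wait one cycle for both of its operands.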

Transputer-style packing: compilation can¬ 
didate. This type of packing may be very 
appealing for GaAs systems. The instruc¬ 
tions tend to be more primitive compared 
to “traditional” RISC instructions. Con¬ 
sequently, we expect that optimization 
opportunities will increase. 

Transputer-style packing: compiler 
algorithms. Code optimization for 
transputer-style packing should be based 
on the same approaches as discussed 
above. Note that more primitive machine 
instructions potentially offer better code 
optimization capabilities. 

Instruction packing effectively increases the memory
bandwidth, which is so desirable in GaAs systems.

Multiprocessing: compilation candidate.
Communication mechanisms in multiprocessor
systems deserve very special attention.
Intersystem (interprocessor)
communication delays have to be
minimized and equalized through both 
hardware design and compiler efforts. 
Compile-time partitioning of complex pro¬ 
grams, into the tasks to be associated with 
different processors, has different require¬ 
ments in GaAs systems. Note that a com¬ 
bined dataflow/reduction approach may 
be better suited for GaAs systems than 
conventional control-flow approaches. 
This is because data are forwarded to 
appropriate destinations when available 
and not when requested. The first method 
is less sensitive to the relatively large inter¬ 
system delays typical of GaAs. 

Multiprocessing: compiler algorithms. In
systems such as these, the compiler and
operating system work together to simplify
execution and to make it
more efficient. Graph simplification by the
compiler is required to eliminate as much
interprocess communication as possible.
Then the task assignment and scheduling 
must be done with communication costs in 
mind, as well as the usual time and mem¬ 
ory costs. 


Compiler optimizations 
for a real GaAs 
processor 

As part of RCA Corporation’s deep 
involvement in GaAs computer system 
design, an optimizing compiler develop¬ 
ment effort is currently underway. As will 
be seen in this section, many of the optimi¬ 
zations presented earlier are integral com¬ 
ponents of this GaAs processor compiler. 

The original development plan for the 
optimizing compiler was to utilize the 
Stanford U-CODE compiler to produce 
MIPS-style code and to use post-pass reor¬ 
ganization to eliminate the sequencing and 
timing hazards. The U-CODE compiler 
actually performs a great number of the 
traditional compiler optimizations men¬ 
tioned here, including common subexpres¬ 
sion elimination, code motion for loop 
invariant code, tree height reduction for 
expressions, and induction variable elimi¬ 
nation. The decision to use this decompo¬ 
sition was made for commercial rather 
than technical reasons—the customer 
mandated that post-pass reorganization 
would be used. Thus, the benefits of pre- 
and post-reorganization for register allo¬ 
cation were lost. Also, the decision was 
made to utilize the same strategy as the 
Stanford project that had been very 
successful—that is, local reorganization 
and type-1, -2, and -3 branch optimiza¬ 
tion. Because of the longer delays 
involved, an extra pass was required in the 
reorganizer to deal with interblock write 
delays; interblock resource utilization was 
not permitted. 

It is interesting to consider some of the 
characteristics of the prototype architec¬ 
ture to get a feeling for the utility of the 
issues discussed here. The write delay for 
a typical ALU instruction was three cycles 
for a single-length instruction, and four 
cycles for a double-length instruction. 
(Note that two different instruction 
lengths were needed; this is easily handled 
with the resource utilization mechanisms.) 
The branch delay is six cycles; the load 
delay is seven cycles. 
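
To give a feeling for what these delays cost when nothing is reordered, the sketch below pads a naive instruction sequence with no-ops. It is only an illustration: the tuple format is invented for the example, and it assumes that a delay of d means d instruction slots must separate a producer from its consumer (or follow a branch), which may not match the prototype's exact definition.

```python
NOP = ("nop", None, ())
WRITE_DELAY = {"alu": 3, "load": 7}  # cycles, taken from the figures in the text
BRANCH_DELAY = 6


def pad_with_noops(code):
    """code: list of (kind, dest, srcs). Returns the sequence with no-ops
    inserted so that every operand is ready and every branch delay is filled."""
    out = []
    ready = {}  # register -> earliest slot index at which it may be read
    for kind, dest, srcs in code:
        earliest = max((ready.get(s, 0) for s in srcs), default=0)
        while len(out) < earliest:
            out.append(NOP)
        out.append((kind, dest, srcs))
        if kind == "branch":
            out.extend([NOP] * BRANCH_DELAY)
        elif dest is not None:
            # Producer sits at index len(out) - 1; consumer needs d slots after it.
            ready[dest] = len(out) + WRITE_DELAY[kind]
    return out


# Hypothetical fragment: load, dependent ALU operation, dependent branch.
naive = [
    ("load", "r1", ("r0",)),
    ("alu", "r2", ("r1", "r1")),
    ("branch", None, ("r2",)),
]
padded = pad_with_noops(naive)
noops = sum(1 for ins in padded if ins == NOP)
print(len(padded), noops)  # 19 16
```

Under these assumptions three useful instructions occupy 19 slots, which is why the reorganizer's ability to move independent instructions into those slots matters so much.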

The reorganizer has been implemented 
in LISP and is approximately 2500 lines in 
length. It was implemented in 3.5 months. 
After only limited testing, with hand¬ 
generated input programs, several issues 
have become quite clear. First, local reor¬ 
ganization is not sufficient to find enough 
independent instructions to fill over load 
and branch delays. Interestingly, load
delays are rarely a problem, because most
scalar references can be held in registers.
For example, in the Bubblesort program,
there were usually enough instructions
available to fill in the load delays. In the
case of branch delays, the story is quite
different.

Table 2. Performance statistics for RCA’s 32-bit GaAs microprocessor. Pre-Op: before code optimization; Post-Op: after
code optimization.

                                  No-Ops (%)                 Extrapolated real
                                                              execution time*
                           Static            Dynamic        (Peak speed 40 MIPS)
Benchmark               Pre-Op  Post-Op   Pre-Op  Post-Op    Pre-Op  Post-Op
Realmm (8x8)             34.8    17.7      19.1    10.2      30.93    34.16
Bubblesort (20 items)    47.1    25.1      45.1    19.9      21.09    30.35
Weight                    —      14.9       —      10.1               34.72
Puzzle                   51.4    31.1      32.5    16.3      24.21    29.29

*Assumes an infinitely large cache.

Even after keeping the whole program 
graph resident simultaneously (in virtual 
memory, naturally) so that all of type-1, 
-2, and -3 branch optimizations could be 
attempted, the program was able to fill 
fewer than half of the branch fill slots. A 
brief analysis of the program revealed that 
global reorganization would have filled 
approximately 30 percent of the ones 
remaining. By knowing the first six 
instructions of various runtime library 
routines and their register usage, another 
30 percent could have been saved. Approximately
15 percent of the slots could not
have been filled by any of the current tech¬ 
niques. Last, approximately 25 percent of 
the slots were due to complicated branch¬ 
ing among short blocks. The architectural 
solution to sequencing hazards described 
earlier could have been used to cover these. 
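
The four percentages quoted above partition the branch delay slots that the reorganizer left unfilled. The short calculation below, whose category labels are our own paraphrases, simply adds them up to show how much of the remainder each class of remedy could reach.

```python
# Shares of the *unfilled* branch delay slots, in percent, as quoted in the text.
unfilled_breakdown = {
    "global reorganization": 30,
    "known runtime-library prologues": 30,
    "not fillable by current techniques": 15,
    "complicated branching among short blocks": 25,
}

total = sum(unfilled_breakdown.values())

# Slots that further compiler work alone could have filled.
recoverable_by_compiler = (
    unfilled_breakdown["global reorganization"]
    + unfilled_breakdown["known runtime-library prologues"]
)

# Adding the slots that the architectural solution could have covered.
recoverable_with_hardware = (
    recoverable_by_compiler
    + unfilled_breakdown["complicated branching among short blocks"]
)

print(total)                      # 100
print(recoverable_by_compiler)    # 60
print(recoverable_with_hardware)  # 85
```

In other words, compiler techniques alone address about 60 percent of the remaining slots, and only the combination of compiler and architectural support reaches about 85 percent.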

Table 2 presents statistical performance
data for RCA’s 32-bit GaAs microprocessor,
using the above-described synergism
methodology and the “standard” benchmarks
from Gross. 14 Data from Table 2
show relatively good performance. Still,
strong research efforts are needed in the 
area of code optimization, in the condi¬ 
tions typical of GaAs technology. When 
analyzing data from Table 2, keep in mind 
that the peak speed of this microprocessor 
is 200 MIPS. For the GaAs machine the 
branch latency would be six and the load 
latency would be seven (with the store 
latency equal to zero). The peak speed of 
the silicon counterpart microprocessor is 
40 MIPS. For comparison purposes, in
RCA’s 32-bit silicon machine, 5 the branch
latency is three and the load and store 
latencies are also three. Both machines 
have an ALU latency of one. 

Gallium arsenide technology is
rapidly approaching VLSI levels 
of integration. In order to better 
exploit the strength of this new technol¬ 
ogy, its weaknesses must be avoided as 
well. Presently, we have identified the low 
transistor count of GaAs chips and the 
relatively slow off-chip environment as the 
principal limitations to achieving the
improvement factor indicated by gate 
speed alone. Because GaAs does not have 
the transistor count capability to offset the 
slow off-chip environment, it must rely on 
the compiler writer to enable a system to 
present an optimum execution environ¬ 
ment, and to make up for what the 
architect cannot achieve with his hardware 
design. Of course, the architect must pro¬ 
vide a hardware design that lets the com¬ 
piler designer make optimal use of existing 
hardware resources. 

The optimal solution is in achieving a 
synergism between the hardware and the 
compiler (“synergism methodology”). In 
that case, together they achieve perform¬ 
ance that neither of the two can achieve 
alone. In this article we have shown several 
examples of this cooperation, which we 
perceive as vital for the best exploitation 
of this new technology. We are currently 
working on algorithmic details necessary 
to improve the above strategy. 

Computer Science Program Accreditation:

The First-Year Activities of the 
Computing Sciences 
Accreditation Board 

Taylor Booth* and Raymond E. Miller



This report summarizes the activities
of the Computing Sciences
Accreditation Board (CSAB) 
from its inception in 1984 through its first 
accreditation cycle completed in June 
1986. The major activities during this 
period were directed at developing the 
CSAB structure necessary to carry out the 
accreditation process, and at conducting 
the first round of accreditation visits and 
actions. 

Accreditation of educational programs 
that prepare a student for entry into a 
profession can play an important role in 
improving the quality of educational pro¬ 
grams in that field and furthering the 
development of the profession. The effec¬ 
tiveness of any accreditation effort 
depends on the close cooperation and 
interaction between the appropriate tech¬ 
nical and professional societies, the aca¬ 
demic community, and the practitioners 
involved in the profession. This interaction 
was clearly understood by the Association 
for Computing Machinery (ACM) and the 
Computer Society of the IEEE (CS/IEEE) 
when they formed the Computing Sciences 
Accreditation Board (CSAB) in 1984. 1 


*The final draft of this article was completed by the
second author after the untimely death of Taylor 
Booth on October 20, 1986. 


Accreditation criteria used to evaluate 
undergraduate programs establish mini¬ 
mum standards that all graduates from the 
program must satisfy. These criteria must 
be flexible enough to accommodate a 
range of educational approaches, while 
ensuring that all graduates have a strong 
background in the fundamentals and prac¬ 
tices considered necessary to enter the 
profession represented by the criteria. 

The members of the joint ACM/Com¬ 
puter Society of the IEEE task force that 
developed the current Computer Science 
Accreditation Criteria brought a wide 
range of academic and industrial view¬ 
points to the discussions. They realized 
that the criteria had to ensure that each stu¬ 
dent would have a balanced broad-based 
education and a solid foundation in the 
basic concepts and practices necessary to 
enter the computer field. Using these 
objectives, they developed a set of draft 
criteria and circulated the draft document 
to a range of computer professionals from 
industry and higher education for com¬ 
ments and recommendations. Copies of 
the draft material were also sent for com¬ 
ment to the departments listed in the ACM 
Administrative Directory. Workshops were 
held at numerous major computer confer¬ 
ences to provide an opportunity for open 
discussion of the proposed criteria by 
members of the computer profession. 

In the extensive feedback that the task
force received on the draft criteria, the 
majority of the comments were positive 
and offered thoughtful suggestions for 
improvements. However, the task force also 
received letters strongly opposing the idea 
of accrediting computer science programs. 
After evaluating all comments and sugges¬ 
tions, the task force made many revisions 
before submitting the document to CSAB 
with the recommendation that it be 
adopted as the criteria document for evalu¬ 
ating and accrediting computer science 
programs. These criteria are presented in 
reference 1. A brief summary of items 
covered in the general criteria section is 
given in the sidebar. A detailed discussion 
of this process and the various reactions 
received to the draft criteria are covered in 
Engel and Dalphin, 2 Jones and Mulder, 3 
Mulder and Dalphin, 4 and Myers. 5 

When CSAB accepted these criteria, 
they realized that many individuals still had 
reservations about parts of the criteria or 
about the need for computer science 
accreditation. The board felt, however, that 
the generally positive reaction to the 
criteria and widespread acceptance of the 
need for accreditation were sufficiently 
strong for the initiation of the accredita¬ 
tion process to proceed. It was also felt that 
one or two years of actual accreditation 
actions would provide the experience 
needed to tune the criteria to solve prob¬ 
lems that arose during the evaluation of 
actual programs. 


The first year 

CSAB was organized in the fall of 1984 
and incorporated in 1985 with the goal of 
starting accreditation visits during the 
1985-1986 academic year. Intensive plan¬ 
ning for the first accreditation cycle 
occurred during the spring of 1985, and the 
first 31 evaluation visits were undertaken 
in the fall. During the first year, the board 
retained operational responsibility for 
overseeing the complete visit and accredi¬ 
tation process. In future years the accredi¬ 
tation process will be the responsibility of 
the Computer Science Accreditation Com¬ 
mission (CSAC), which was formally 
established by the board in June 1986 to 
conduct evaluations and determine 
accreditation actions for baccalaureate 
computer science programs. The follow¬ 
ing discussion indicates how the visitation 
process was organized by CSAB and car¬ 
ried out during the first year of operation. 

Selecting programs for the first accreditation
visits. The request to have CSAB evaluate
a computer science program is a
voluntary decision on the part of each 
institution. When an institution decides to 
seek accreditation, it must submit a formal 
application to CSAB for an evaluation by 
a CSAB evaluation team and agree to con¬ 
duct a detailed self-study of the program. 
The team uses the information from the
self-study and the information developed
from its on-campus interviews and inves¬ 
tigations to determine how well the pro¬ 
gram offered for evaluation satisfies the 
published criteria, and to verify that all stu¬ 
dents graduating from the program follow 
plans of study consistent with the criteria. 
The results of the evaluation are reported 
to both the visited institution and to CSAB. 
Before taking an accreditation action,
CSAB provides the institution with an
opportunity to respond to the findings of
the visitation team and to supply other
information that would help CSAB make
a final accreditation decision. If, on the
basis of all information developed during
the visit and the response from the institution,
CSAB finds that the program satisfies
the criteria, it accredits the program for
a period of either three or six years. If
CSAB finds that a program differs substantially
from the criteria, it notifies the
institution that the program is not
accredited and suggests changes to improve
the program. Each institution receives a
final statement from CSAB outlining the
findings.


CSAB Computer Science Criteria Outline

Curriculum
  1½ years: Core and advanced computer science
  1 year: Science and mathematics (supporting disciplines)
  1 year: Humanities, social science, communications (general educational requirements)
  ½ year: Electives

Laboratory and computing resources
  Computer facilities
    Both large and small systems
    Laboratory computer facilities available for hands-on use
    One access hour per day per course for each student
    Software support as needed
  Laboratory facilities
    Two students/laboratory station
    Sufficient instrumentation
    Personnel support

Faculty
  Five full-time equivalent (minimum)
  Four full-time faculty with primary commitment to program
  Broad range of interests in computer science
  Demonstrated professional competency in computer science
  Competent to teach full range of core computer science courses
  Desirable: majority of faculty with terminal degree
  Adjunct and other part-time faculty used in a supporting role
  Adjunct faculty responsible for less than 20 percent of teaching coverage

Faculty load and responsibility
  Activity in the computer science profession
    Professional development
    Scholarly activities
    Professional activities
    Research
  Reasonable teaching load
  Reasonable class size
  Reasonable advising load

Students
  Admission standards that ensure capable students
  Grading standards that ensure quality of accomplishment
  Well-developed student advising process in place

Institutional support
  Faculty support
    Salary
    Travel (special value of national conferences)
    Professional development
  Administration
    Positive constructive leadership
  Library support
    Books
    Reference publications
    Journals and transactions
  Secretarial and technician support


Because of the effort required to recruit 
and train teams for campus visits, CSAB 
decided to limit its first cycle to about 30 
evaluations. The ACM Administrative Direc¬ 
tory lists over 450 institutions offering four- 
year undergraduate programs in computer 
science. To all these institutions CSAB sent 
invitation letters announcing its intention 
to start the accreditation of computer 
science programs in 1985-1986, and invited 
them to nominate their program for pos¬ 
sible selection as one of the programs to be 
visited during the first year. More than 150 
institutions replied, with approximately 
120 indicating that they would like to have 
their computer science program evaluated 
in the first cycle. A few indicated a prefer¬ 
ence for a later cycle; the rest indicated that 
they did not wish to participate in the 
accreditation process. Since the number of 
programs requesting evaluation in the first 
cycle far exceeded CSAB capabilities, a 
selection process was devised. For this first 
cycle the board felt that the programs 
selected for evaluation should represent a 
cross section of computer science pro¬ 
grams offered in the United States. To 
accomplish this goal, each program was 
assigned to a subgroup according to a 
number of program and institutional 
characteristics, including size and geo¬ 
graphic location. A quota of visits was 
assigned to each subgroup, and programs 
were randomly selected from subgroups 
until the quotas were filled. Under this sys¬ 
tem about 35 institutions were selected, 
and after some inevitable withdrawals, the 
programs of 31 institutions were evaluated. 
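
The selection procedure described above is a form of stratified random sampling with quotas. A minimal sketch follows; the subgroup labels, quota values, and program names are hypothetical, since the report does not list the actual characteristics or quotas CSAB used.

```python
import random
from collections import defaultdict


def select_programs(programs, quotas, seed=None):
    """programs: list of (name, subgroup); quotas: subgroup -> number of visits.
    Draw programs at random within each subgroup until its quota is filled."""
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for name, group in programs:
        by_group[group].append(name)
    selected = []
    for group, quota in quotas.items():
        pool = by_group.get(group, [])
        k = min(quota, len(pool))  # a subgroup may have fewer applicants than its quota
        selected.extend(rng.sample(pool, k))
    return selected


# About 120 applicants, grouped by two invented characteristics.
applicants = [
    (f"Program {i}", ("small" if i % 2 else "large", "east" if i % 3 else "west"))
    for i in range(120)
]
quotas = {
    ("small", "east"): 10,
    ("small", "west"): 8,
    ("large", "east"): 9,
    ("large", "west"): 8,
}
chosen = select_programs(applicants, quotas, seed=1985)
print(len(chosen))  # 35
```

Fixing quotas per subgroup guarantees the cross section the board wanted, while the random draw within each subgroup gives every applicant in that subgroup the same chance of being visited.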

CSAB notified each institution whose 
program was selected, enclosing an infor¬ 
mation packet explaining how the evalua¬ 
tion was to be conducted and what was 
required, in addition to the self-study, to 
prepare for the visit. The postponed insti¬ 
tutions were also notified and told that they 
would be given a higher weighting for
selection during the second year. 

Activities from program selection to 
accreditation decisions. After an institution 
has been notified that one of its programs 
has been selected for CSAB evaluation, 
preparations for the evaluation by the 
institution and CSAB must proceed. The 
institution must prepare the self-study 
documents and collect supporting 
materials for the on-site visit. CSAB must 
select an evaluation team and ensure that 
the team is prepared for the visit. After the 
visit, reports must be written and edited,
and a preliminary report sent to the institution
so that it can respond to any errors of fact
and report any actions it has taken following the
visit. CSAB also gathers information
about the evaluation team member and 
team chair performance to aid in forming 
teams in future years. Details of these 
activities can be found in Appendices A 
and B. 

The accreditation 
decision 

Having completed the process of self- 
studies, team visits, preliminary state¬ 
ments, and due process responses from 
those institutions desiring to respond, the 
next step was to make the initial accredi¬ 
tation decisions. Since CSAC was still 
being organized, the board, together with 
the team chairs, met as an interim 
CSAC.* This meeting was held in Las 
Vegas on June 20-22, 1986. The meeting 
was run by the CSAB president with the 
help of the past president. 

Making the accreditation decisions. Each 
program was discussed at length, with the 
team chair for the program providing a 
presentation of the team’s findings, the 
preliminary statement results, and the due 
process responses. This was followed by a 
motion for a recommended accreditation 
action. The meeting participants asked 
probing questions and held discussions to 
get details about the program and recom¬ 
mended action. This was followed by a vote 
on the motion or amended motion by all 
meeting participants. 


*As soon as CSAB became operational, the board
appointed an interim CSAC executive committee and 
charged the committee to draw up a proposed set of 
bylaws and rules of procedures for CSAC. This task 
was completed by the June 1986 meeting and CSAB 
voted to establish CSAC. The 1986-1987 program 
evaluation cycle and the accreditation action meeting 
will be run directly by CSAC. 


As the meeting progressed, a specially 
appointed “consistency committee,” made 
up of four CSAB and interim CSAC execu¬ 
tive committee members, was monitoring 
the discussions and accreditation votes on 
all of the programs. This committee was 
charged with detecting instances in which 
programs with similar characteristics 
received different accreditation actions. 
Thus, at the end of the two-day meeting, 
after all programs had been discussed and 
accreditation actions taken, this commit¬ 
tee requested a review of several groups of 
programs for which actions appeared to be 
inconsistent. These groups of programs 
were then reconsidered, followed by the 
final accreditation actions being approved 
by the interim CSAC participants. Follow¬ 
ing this interim CSAC meeting, CSAB met 
as a board. One of the actions of this 
CSAB meeting was the ratification of the 
CSAC accreditation actions. 

The accredited programs. The 23 institu¬ 
tions having programs accredited by CSAB 
were the following: 

• California State University, Sacramento;

• California State University, Stanislaus;

• California Polytechnic State University,
San Luis Obispo;

• University of California, San Diego; 

• University of California, Santa 
Barbara; 

• US Air Force Academy; 

• Georgia Institute of Technology; 

• Iowa State University; 

• Brandeis University; 

• Northeastern University; 

• Worcester Polytechnic Institute;

• Western Michigan University; 

• University of Missouri, Rolla; 

• Mississippi State University; 

• North Dakota State University; 

• Stevens Institute of Technology; 

• New Jersey Institute of Technology; 

• Pace University; 

• Allegheny College; 

• Drexel University; 

• Clemson University; 

• North Texas State University; and 

• University of Utah. 

All 31 programs evaluated received 
notification of the accreditation action and 
a final statement after the meeting. In addi¬ 
tion, each of the institutions whose pro¬ 
grams received a not-to-accredit action 
received details of the CSAB appeals policy 
and procedure as well as instructions that 
they could request a revisit by CSAB in the 
1986-1987 cycle if they felt that their programs
would now meet the CSAB criteria.


Overview of the first- 
year visit 

The number of institutions requesting 
evaluation of their computer science pro¬ 
grams by CSAB in the first year was 
gratifying. It demonstrated an extensive 
institutional interest in computer science 
accreditation. Since this response far 
exceeded CSAB plans and capabilities for 
visits during the first year, it allowed 
CSAB to experiment with how well the 
CSAB approach, including the criteria, fit 
the wide variety of approaches found in 
computer science education. 

Program selection. In order to select a 
broad spectrum of programs to be visited, 
but provide a fair chance to be selected for 
each institution that applied, the categori¬ 
zation and random choice process dis¬ 
cussed earlier was developed. This 
provided a broad cross section of programs 
in the 31 institutions that were selected and 
actually visited. 

As an example of this diversity, about 
one-third of the programs were in arts and 
science organizations, one-third in engi¬ 
neering organizations, and the remaining 
one-third in organizations unique to their 
particular campus. This diversity also 
appears in the 23 programs accredited in 
this first CSAB accreditation cycle; and this 
seems to demonstrate the great flexibility 
available under the CSAB criteria for insti¬ 
tutions to develop accreditable computer 
science programs consistent with the insti¬ 
tutions’ own purposes or goals. 

Problems facing the selected institutions.
Those institutions selected for the first-year
visits faced some unusual problems. The 
CSAB criteria were developed after most 
of these programs were started. Thus the 
design and implementation of the pro¬ 
grams were independent of the CSAB or 
any other overall standard criteria for com¬ 
puter science programs. In contrast, in 
areas where accreditation has been going 
on for many years, it is common for the 
programs to be designed to meet the 
criteria for accreditation. 

The curriculum guidelines from ACM 
and the CS/IEEE have been helpful in set¬ 
ting up or revising many computer science 
programs. However, these guidelines tend 
to discuss only the courses forming the 
computer science segment of a program
rather than treating the numerous other 
factors included in the CSAB criteria. The
differences between the ACM and
CS/IEEE guidelines and the CSAB
criteria, including the differences in the
number of courses recommended for the
computer science segment, were not always
fully understood by the institutions. There
were some instances of the people from 
institutions claiming that their programs 
met the ACM (or CS/IEEE) guidelines 
and therefore were accreditable, whereas 
the evaluation teams based their decisions 
on the CSAB criteria. 

Also, since computer science accredita¬ 
tion was new, many of the people involved 
were not familiar with what was required 
in preparing for an accreditation visit. 
Some department chairs were confused by 
details of the criteria, or had difficulty in 
interpreting the instructions for complet¬ 
ing the self-study documents. Also, deans 
of the nonengineering colleges tended to be 
less familiar with accreditation than engi¬ 
neering deans. CSAB made efforts to 
alleviate these problems by inviting the 
selected institutions to send observers to 
the CSAB training sessions for program 
evaluators, and by encouraging the institu¬ 
tions to call the CSAB office if they had 
any questions. Both of these opportunities 
were taken advantage of by the institutions, 
and this aided significantly in helping the 
institutions prepare for the CSAB 
evaluation. 

Problems in meeting CSAB criteria. As
the accreditation process proceeded, a
number of common problems emerged for
programs trying to meet the CSAB criteria. Some
programs did not have a sufficient number 
of qualified faculty assigned to the pro¬ 
gram. This was because they had an 
insufficient number of faculty devoted to 
the program or because the faculty was 
unable to teach a broad spectrum of 
courses, as well as make scholarly contri¬ 
butions to the computer science discipline. 
Also, programs were often understaffed, 
resulting in class sizes or teaching loads 
exceeding the limits in the criteria. Some 
programs lacked laboratory resources, 
resulting in students not having ready 
access to a variety of computing environ¬ 
ments or to courses that included sufficient 
software development practice. 

Among the most common problems, of 
course, were deficiencies of one kind or 
another in meeting the curriculum require¬ 
ments. There were shortages in the number 
of computer science courses or the balance
between core and advanced material in
computer science. Also, some programs 
did not meet the CSAB requirements for 
science. In some cases, the equivalent of a 
two-semester sequence in a laboratory 
science was satisfied, but the requirement 
for two additional courses in science to 
develop breadth or depth was not met. 
Another deficiency, particularly in pro¬ 
grams that had a more practical emphasis, 
was difficulty in meeting the general edu¬ 
cation requirement for one year of study in 
the humanities, social sciences, and other 
disciplines used to broaden the educational 
background of the student. It is particu¬ 
larly important to notice that even though 
the CSAB accreditation is a discipline- 
oriented accreditation, the criteria call for 
other aspects of education within the pro¬ 
gram, and the team did indeed check that 
these components were also included 
within the programs. For example, both the 
science and general education require¬ 
ments are viewed as necessary for accredi¬ 
tation to be granted. 

In some cases, the institutions them¬ 
selves detected these deficiencies while 
developing their self-study document. 
When this occurred, the institutions were
often able to make adjustments in their 
curriculum that would remove the defi¬ 
ciency. In other cases, these deficiencies 
were discovered by the CSAB team during 
some part of the visit cycle. When they were 
detected, the team informed the institution 
so that the institution could consider 
remedial action. This often resulted in 
adjustments being made shortly after the 
completion of the accreditation visit and 
in time so that the changes could be 
reported to CSAB before the final accredi¬ 
tation action was taken. Generally speak¬ 
ing, the institutions were very responsive in 
their attempts to correct any deficiencies 
that were noted. 

Team performance. The effort of CSAB 
to select teams appropriate for each insti¬ 
tution appears to have paid off in how the 
institutions reacted to the visiting teams. 
Often the institutions expressed satisfaction
with the team members and with the
thoroughness of their work. Appendix C
provides a number of unsolicited com¬ 
ments that reflect the reactions of the insti¬ 
tutions to the team members. Fortunately, 
personal conflicts among the teams, or 
lack of confidence in the team procedures 
or findings, were extremely rare. 

Similarly, the teams experienced a very 
high level of cooperation from the institu¬ 
tions during the visits. CSAB teams on 
occasion requested scheduling changes and
additional documentation, especially con¬ 
cerning course texts, syllabi, homework, 
and test examples. There were only rare 
instances where these requests were not
responded to in a prompt and cooperative 
manner. On the whole, the CSAB teams 
were very well prepared for their visits and 
the institutions were impressed with the 
professional competence of the teams. 

As can be expected, the institutions were 
particularly interested in hearing the find¬ 
ings of the teams. Before the exit interview 
the team chair would informally review the 
preliminary findings of the team with the 
department head to make sure that no 
major mistakes or misunderstandings had
occurred. This was followed by a more for¬ 
mal exit interview with top administrative 
officers of the institution. 

Often weaknesses or deficiencies found 
by the teams simply confirmed what the 
institutions already knew either from their 
previous operation of the program or 
through the CSAB self-study process. 
There were, however, numerous instances 
in which issues were raised that had not 
been considered by the institution. In most 
cases, the institutions accepted the findings 
of the team gracefully, and when deficien¬ 
cies were discussed, the institutions tended 
to inquire about ways to correct these 
problems. 

Usually, it was clear that the administra¬ 
tions were very interested in their computer 
science programs and were looking for 
ways to strengthen them so that they would 
be accreditable and of high quality. On rare 
occasions, it became clear that there was a 
gulf of misunderstanding between the 
administration and the department over 
what level of support was required to build 
an accreditable program. Even in these 
cases, however, the CSAB process of evalu¬ 
ation seemed to be very helpful to the insti¬ 
tutions in clarifying the issues. 

The postvisit activity. After the team 
leaves the institution immediately follow¬ 
ing the exit interview, the next formal com¬ 
munication with CSAB is the preliminary 
statement to the institution. Since most of 
the team chairs had no previous experience 
in writing these statements, the editors had 
to extensively rewrite the statements so that 
they clearly indicated if the programs did 
or did not satisfy each of the CSAB 
criteria. In some cases, this required a 
detailed study of the full postvisit reports 
and phone calls to the team chairs. This 
rewriting process delayed the mailing of the 
preliminary statements to the institutions; 
however, in all cases, the statements were 
sent to the institutions in sufficient time for 
them to formulate a response. 

Most institutions replied to the prelimi¬ 
nary statement with a “due process 
response.” The range of corrective actions 
and program revisions taken by the insti¬ 
tutions vividly demonstrated the positive 
impact that CSAB accreditation can have 
on improving the quality of undergraduate 
computer science programs. The responses 
were indeed amazing. In numerous 
instances, new faculty positions were made 
available at competitive salary rates. At one 
institution in which the lack of senior 
leadership was noted as a major problem, 
a funded endowed chair became available! 
Where instructional laboratory equipment 
was noted as deficient, some institutions 
responded by purchasing additional equipment 
and setting up new laboratories. Corrections 
to curricula were also numerous, 
with additional requirements or firming- 
up of requirements being made. Computer 
science segments were strengthened with 
additional course requirements. Addi¬ 
tional science requirements were put in 
place where needed, and general education 
requirements were added to provide the 
breadth of education required by the 
criteria. It is tempting, indeed, to provide 
specific examples of these changes, but 
confidentiality of the process between 
CSAB and the institutions precludes this. 
Suffice it to say that of the 23 institutions 
receiving accreditation, significant 
improvements were made in over two- 
thirds of the programs. 


Future CSAB activities 

The second year of CSAC/CSAB 
accreditation activities actually started in 
the fall of 1985, when institutions were 
notified of the opportunity to request that 
their computer science programs be con¬ 
sidered for CSAB accreditation in the 
1986-1987 cycle. Over 60 institutions 
responded, including those remaining 
from the first year queue. In January 1986 
40 institutions were selected for a visit, 
again by a random process. However, this 
time those institutions in the queue from 
the previous year were given a higher 
chance of being selected. It is expected that 
between 30 and 40 CSAB evaluation visits 
will actually be made during the fall of 
1986.* As is normal, a few programs have 
reason to withdraw after being notified of 
their selection. 

*Thirty-five institutions actually had evaluation 
visits in the 1986-1987 cycle. 

The 1987-1988 cycle was started in the 
fall of 1986. Although it is too early to 
know what level of interest will be shown 
for this third year, the satisfaction 
expressed through the first-year experience 
and the actual accreditation of 23 pro¬ 
grams in the first year seem to indicate that 
the numbers will remain high. 

In the long run, there appear to be 
somewhere around 350-400 baccalaureate 
programs in the United States that would 
be appropriate for CSAB computer 
science accreditation. This would impose 
a steady-state evaluation load of around 
90-100 campus visits per year. It will, how¬ 
ever, be a number of years before such a 
level is reached. As the number of visits 
made each year increases, it will be a chal¬ 
lenge to the ACM and the Computer Soci¬ 
ety to find a sufficient number of team 
chairs and program evaluators to accom¬ 
modate such a load. 
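
The steady-state figures above can be checked with simple arithmetic. The short Python sketch below is only an illustration: the program and visit counts are the estimates quoted in the text, while the function name and the "implied interval" it computes are our own, and the calculation assumes one visit per program and ignores denials, revisits, and newly applying programs.

```python
# Back-of-the-envelope check of the steady-state visit load quoted above.
# The program and visit counts come from the text; the "implied interval"
# is a derived illustration, not a CSAB figure.

def implied_interval_years(programs: int, visits_per_year: int) -> float:
    """Average number of years between visits to any one program."""
    return programs / visits_per_year

low = implied_interval_years(350, 90)    # lower estimates from the text
high = implied_interval_years(400, 100)  # upper estimates from the text

print(f"implied interval: {low:.2f} to {high:.2f} years")
# prints "implied interval: 3.89 to 4.00 years"
```

An average of about four years between visits lies between the three-year and six-year accreditation terms described in Appendix A, which is what a mix of the two terms would produce.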

COPA recognition. In the near future 
CSAB expects to apply for formal recog¬ 
nition by the Council on Postsecondary 
Accreditation (COPA) and by the US 
Department of Education. The board has 
kept these agencies apprised of the CSAB 
activities up to this point. Since a track rec¬ 
ord of accreditation experience, and 
interest from institutions to have their pro¬ 
grams accredited, is needed before recog¬ 
nition can be obtained, these initial 
accreditation cycles are the necessary first 
steps prior to applying. Certainly the 
requests for accreditation so far and the 
estimated total number of programs 
already demonstrate the need for accredi¬ 
tation. Also, there are strong indications 
that the accreditation process will materi¬ 
ally improve the quality of baccalaureate 
computer science education in the United 
States without imposing undue restrictions 
on exactly how institutions decide to imple¬ 
ment their programs. 

Society and industrial support. Both ACM 
and the CS/IEEE continue to provide 
strong support for CSAB and the accredi¬ 
tation process. This support is given both 
monetarily and by supplying highly quali¬ 
fied volunteers in all stages of the process. 
There appears to be strong interest from 
industry as well, since accredited programs 
ensure some level of quality in the bac¬ 
calaureate education. It is customary for 
the cost of the accreditation to be borne 
jointly by the institutions being accredited 
and the societies of the profession. Initially, 
extra funds are needed to cover some of the 
startup costs. Thus, additional monetary 
support from industry, foundations, and 
government agencies is being sought. 

Criteria revisions. Accreditation in 
general and some particulars of CSAB 
accreditation have met with opposition 
from various quarters; for example, see 
Myers 5 and Gibbs and Tucker. 6 There are 
some university administrators who feel 
that the CSAB accreditation of computer 
science programs impinges on their flexi¬ 
bility for designing computer science pro¬ 
grams that fit into the particular character 
or mission of their institution. This oppo¬ 
sition has, to some extent, focused on the 
liberal arts-type programs. Other opposi¬ 
tion comes from particular concerns about 
the CSAB criteria, with fears that the com¬ 
puter science segment of the criteria is so 
large that it constrains the breadth of edu¬ 
cation that a student would obtain in four 
years. 

Even though the variety of programs 
accredited in the first year demonstrates that 
considerable flexibility is possible under 
the current CSAB criteria, 1 CSAB and its 
two sponsoring societies have been atten¬ 
tive to these concerns. First, two work¬ 
shops were held by the sponsoring societies 
to discuss the question of computer science 
programs for small and/or liberal arts colleges. 
The results of these workshops are 
reported elsewhere. 2 Also, the societies 
and CSAB have discussed these issues at 
length. As a result of this, as well as expe¬ 
rience gained through the first-year 
accreditation cycle, a CSAB committee was 
formed in January 1986 to study these 
views and propose possible changes to the 
CSAB criteria. 

After extensive discussion by the CSAB 
board at its June 1986 meeting, preliminary 
approval was given to a number of pro¬ 
posed changes in the CSAB criteria for 
computer science. These proposed new 
criteria have been sent out to the two soci¬ 
eties and all institutions with baccalaure¬ 
ate computer science programs for review 
and comment. It is expected that CSAC 
can complete a full evaluation of the com¬ 
ments and recommendations about the 
proposed criteria during the fall of 1986. 
If all goes as planned, these modified criteria 
could be adopted at the January 1987 
CSAB meeting. This would enable the 
third-year accreditation cycle (1987-1988) 
to use these new criteria for evaluation.* 


*Indeed, new criteria were adopted by CSAB on 
January 23, 1987. 

The continued monitoring and adjustment 
of the CSAB criteria is considered to 
be an important responsibility of the board 
and of CSAC. It is very important that the 
criteria be responsive to the needs of the 
academic community. A continuing dialog 
among all members of the computer 
profession will be necessary to make sure 
that the criteria are an effective vehicle for 
carrying out the long-range goal of 
improving computer science education. 

Reaction of computer science chairs. There 
has also been opposition toward computer 
science accreditation from some professors 
of computer science and some chairs of 
computer science programs. The various 
views have been discussed in print 5,6 as 
well as at the biennial Snowbird meetings 
of computer science chairs of PhD- 
granting departments. At the most recent 
Snowbird meeting, in July 1986, the first- 
year accreditation results were briefly 
reported. At this meeting, and subse¬ 
quently, it has been particularly gratifying 
to see that there is considerable interest 
from these chairs in having their programs 
considered for CSAB accreditation. Also, 
some of the people originally opposing 
CSAB accreditation, through the results of 
the first-year visits, have seen that benefits 
are being derived from the process. Thus, 
they have changed their view and are now 
more favorably disposed toward CSAB 
accreditation. 

We have described the computer 
science accreditation activities 
of the Computing Science 
Accreditation Board from the inception of 
the board in 1984 through July 1986. 
Emphasis was placed on the activities sur¬ 
rounding the first-year accreditation cycle 
of 1985-1986, which culminated in 23 insti¬ 
tutions receiving CSAB accreditation for 
their computer science programs. Partic¬ 
ular points of interest include descriptions 
of how the process operates, how visiting 
teams are selected, how accreditation deci¬ 
sions are made, and how modifications are 
sought in the evaluation of this accredita¬ 
tion process. 

The activities to date show a continuing 
and growing interest by academic com¬ 
puter science programs to have their 
undergraduate programs accredited by 
CSAB, and a building of sentiment sup¬ 
porting the accreditation activity. 

The board awaits reaction to the pro¬ 
posed criteria changes as it continues to 
seek improvements in the criteria that will 
provide maximum benefit to improve 
undergraduate computer science educa¬ 
tion. At the same time the second- and 
third-year accreditation activities are 
proceeding on schedule, with 35 institu¬ 
tions scheduled for visits in 1986-1987 and 
invitations having been mailed for the 
1987-1988 cycle. 

It is expected that CSAB will continue 
to interact with ACM, the Computer Soci¬ 
ety, and the computer science community 
to further improve the computer science 
accreditation activities. Also, some 
interest has been shown in considering 
accreditation of other computing programs, 
in particular in the information systems 
area. Discussions have been held 
among representatives from ACM, Data 
Processing Management Association, and 
CS/IEEE about the possibility of accredi¬ 
tation criteria for information systems, but 
these discussions appear to still be in their 
early stages. Also, CSAB has held discussions 
with, and made presentations to, a number 
of other groups about its activities and 
possible extensions of its accreditation 
activities to other computing areas. CSAB 
will continue to be happy to interact with 
appropriate professional societies and 
other groups wherever CSAB assistance 
might be deemed helpful. □ 

Appendix A: The evaluation process 


Previsit preparation. To prepare for its on- 
campus evaluation visit, each school must con¬ 
duct an extensive self-study to demonstrate how 
its program satisfies CSAB’s published accredi¬ 
tation criteria and must submit a two-volume 
report on the self-study results to CSAB head¬ 
quarters by June of the calendar year in which 
the visit is expected to take place (for the 
1985-1986 cycle, this report was due in June 
1985). 

This report covers curriculum organization 
and content, faculty qualifications, institutional 
support, student selection, faculty salary struc¬ 
ture, and laboratory facilities. (CSAB’s only 
interest in faculty salary levels is to judge whether 
or not they are adequate to recruit and retain 
qualified faculty members.) 

Each visit involves a specially trained visiting 
team consisting of an experienced team chair 
and two program evaluators, all of whom are 
qualified computer scientists. Each team chair, 
upon being notified of an assignment, becomes 
the primary CSAB contact with the institution, 
whose dean has also been given the name and 
address of the team chair. The team chair also 
takes responsibility for organizing the details of 
the evaluation visit. This process starts by con¬ 
tacting the institution to select a visit date. Once 
the visit date is established, the team chair 
selects, from a list provided by CSAB, the two 
program evaluators needed to complete the visit¬ 
ing team. At this point, the institution is asked 
to approve the composition of the team and to 
send each member copies of the self-study report 
and other materials that the team would need to 
carry out the visit. 

Conduct of the on-campus visit. The on-campus 
visit is a most important step in the 
accreditation process since it provides an oppor¬ 
tunity to carry out a thorough evaluation of the 
curriculum offered by the institution, the qual¬ 
ity of the faculty, and the commitment of the 
institution to the program. The team chair has 
very wide latitude in conducting the particular 
combination of campus interviews and campus 
investigations that he or she deems necessary to 
make a thorough evaluation. In some unusual 
situations, the team chair may elect, after con¬ 
sultation with the appropriate CSAC/CSAB 
officials, to extend the visit for an additional day 
after negotiating an appropriate additional 
charge with the institution’s representatives. 

The team program evaluators are usually 
responsible for visiting the department and talk¬ 
ing with the department head and faculty, while 
the team chair conducts interviews with the 
different administrative officers responsible for 
the support and operation of the program. These 
assignments provide the team with a good over¬ 
view of the support of the institution’s adminis¬ 
tration, the status of the program, and the 
commitment of the faculty to the growth and 
development of the program. 

The departmental visit provides an opportu¬ 
nity for evaluating the quality of the departmen¬ 
tal laboratories, for inspecting sample class 
work, and for seeing the facilities provided to the 
faculty to carry out their work. This information 
is used to verify and expand upon the informa¬ 
tion in the self-study and to understand how the 
published curriculum is actually presented to the 
students in the program. 

A baccalaureate computer science curriculum 
almost always includes courses taught by other 
departments. Thus, part of the evaluation pro¬ 
cess includes a visit to these supporting depart¬ 
ments. These visits provide both an insight into 
the contribution that these departments make 
to the computer science curriculum and an 
opportunity to determine how the faculty from 
outside computer science view the quality of 
computer science students. 

Another important part of the visit is the 
opportunity it provides to meet and talk with the 
students currently in the program. These meet¬ 
ings provide an insight into the quality of the stu¬ 
dents, their evaluation of how well they feel that 
the program meets their needs, and their percep¬ 
tions of the program’s prerequisite structure. 

At the end of the visit the team conducts an 
exit interview with the chief executive officer of 
the institution and other top administrators. 
This meeting reviews the tangible findings that 
the team has developed in comparing the pro¬ 
gram with CSAB criteria requirements and gives 
the institution an opportunity to correct errors 
in the team’s perceptions. By the end of the meet¬ 
ing, it is expected that the institution will have 
a general overview of the team’s impression of 
the program. However, since there are several 
additional stages of the accreditation process 
that must be completed before the final accredi¬ 
tation decision is made by CSAB, the team is 
careful not to indicate what the recommended 
accreditation action will be. 

Postvisit activities. Immediately after the visit, 
program evaluators prepare a visit report 
describing their findings and recommendations. 
These reports are sent to the team chair, who pre¬ 
pares a complete confidential visit report for the 
CSAB files and a shorter summary, called the 
Proposed Statement to the Institution, that will 
form the basis of the report sent to the institution. 
These reports are sent to a CSAC “first-tier 
editor” together with the team’s accreditation 
recommendation. This editor has the responsi¬ 
bility of editing the Proposed Statement to the 
Institution so that it is consistent with CSAB 
criteria and the information contained in the visit report. 

After “second-tier editing” by a senior CSAC 
officer and formatting in the CSAB office, the 
proposed statement becomes the Preliminary 
Statement For Review and Comment, which is 
sent to the institution. Usually no indication of 
the recommended accreditation action is given 
in the statement (unless the program is clearly 
so deficient that it may not be accredited). The 
institution has 30 days to respond formally, 
pointing out errors or omissions in the statement 
or providing additional information it considers 
important in reaching a final accreditation deci¬ 
sion. Many of the institutions use the time 
between the visit and the receipt of the 
preliminary statement to correct problems indi¬ 
cated at the exit interview. 

The formal response of the institution is sent 
to the team chair for evaluation. In many cases, 
the team chair will modify the preliminary state¬ 
ment to reflect the information in this response. 
Once the suggested corrections are made, the 
team chair forwards the revised proposed statement 
through the editing chain to CSAC 
together with a revised accreditation recommen¬ 
dation. 

The final accreditation action. In the future the 
final accreditation action will be taken by the 
Computer Science Accreditation Commission of 
CSAB. For the first year the function of CSAC 
was carried out by a special committee made up 
of all of the chairs of the accreditation teams. 
This committee, acting as a committee of the 
whole, made the final accreditation decision on 
all programs visited, and these decisions were 
ratified by CSAB. 

Three accreditation actions are possible. The 
program under review can be accredited for six 
years, a 6V action; it can be accredited for three 
years, a 3V action; or it can be denied accreditation, 
an NA action. A 6V accreditation is given 
to a program judged to meet all the published 
criteria and to be in a stable mode of operation. The 
3V action is given to programs that essentially 
meet all of the criteria requirements but have 
some potential instabilities that, if left uncorrected, 
could make the programs unaccreditable. 
Such programs are judged to require 
a subsequent visit in three years to be sure that 
they still satisfy the accreditation criteria. When 
CSAC finds that a program seriously deviates 
from the published criteria, it denies accredita¬ 
tion to the program. 

During the meeting of CSAC, each program 
being considered for accreditation is reviewed in 
an oral presentation by the team chair respon¬ 
sible for the visit. This review is followed with 
the team chair’s motion for a recommended 
action. After discussing the visit report, the final 
accreditation action is taken by a majority vote 
of the members of CSAC. Following this vote, 
the team chair is responsible for preparing a 
Final Statement to the Institution, which reflects 
the findings of the visiting team, the response 
made by the institution to the preliminary state¬ 
ment, and the final action of CSAC. This state¬ 
ment is enclosed with the CSAB president’s 
letter stating the final action taken by CSAC. 

If the action is not to accredit, then the insti¬ 
tution can appeal the decision to CSAB if it feels 
that there were errors in fact or that incorrect 
procedures were used to reach this decision. A 
3V action is not appealable. CSAB reviews all 
such appeals, usually by appointing a three- 
person appeals hearing committee, and tries to 
resolve the points raised by the institution in the 
appeal. CSAB has full discretionary power to 
reverse, modify, or sustain a not-to-accredit 
action of CSAC. 

If an institution that receives a not-to-accredit 
action feels that it has corrected the deficiencies 
in its program, it may request an immediate 
revisit during the next accreditation cycle. This 
request is normally honored by CSAB. On the 
other hand, the institution may decide to wait 
until it has had more time to modify its program 
before requesting another evaluation visit. 
CSAB keeps confidential the names of institu¬ 
tions whose programs were evaluated but not 
accredited. Also, the names of institutions 
applying to CSAB for accreditation, those 
selected, and any of the reports on individual 
programs generated by CSAB are kept con¬ 
fidential. 

Appendix B: Selection, assignment, and support of the evaluation teams 


The success of the accreditation process 
depends on the quality of the visiting teams. 
Thus, CSAB has established a selection and 
training process ensuring, as far as possible, that 
every team member has a broad background in 
computer science and an in-depth understand¬ 
ing of the accreditation process. 

Selection of team chairs and program evaluators. 
The initial selection of all persons who serve as 
team chairs or program evaluators is the respon¬ 
sibility of the Education Board of the ACM and 
the Educational Activities Board of the Com¬ 
puter Society of the IEEE. Each society has 
established a selection committee responsible for 
identifying prospective candidates. These can¬ 
didates are selected according to broad guide¬ 
lines established by CSAB to ensure that each 
candidate has the experience and knowledge 
necessary to properly evaluate a program. 

The primary selection requirement for any 
candidate is a recognized professional standing 
in the computer field. A candidate from acade¬ 
mia would normally be expected to be a profes¬ 
sor or associate professor in a department with 
an active computer science or computer engi¬ 
neering program. Similarly, candidates from 
industry or government would normally be 
expected to hold a senior-level professional or 
management position involving interaction with 
recent graduates from undergraduate computer 
science programs, or close interaction with com¬ 
puter science education through an adjunct 
professorship or similar arrangement. 

After CSAB received the nominations, it 
selected an initial group of prospective team 
chairs. These persons were invited to attend a 
team chair training session held in New Orleans 
in February 1985. The persons who attended this 
first training session served as the team chairs for 
the visits made during the fall of 1985. Some 
members of this group had previous accredita¬ 
tion experience, serving as part of an Accredi¬ 
tation Board for Engineering and Technology 
(ABET) or a regional accreditation team. 

Once the selection of team chairs was com¬ 
pleted, CSAB then selected a much larger group 
of potential program evaluators. The members 
of this group attended training workshops in the 
spring of 1985. (The workshops were geographically 
distributed for the convenience of attendees.) 
These workshops were also open to 
representatives of departments who were expect¬ 
ing a fall 1985 visit so that they could develop an 
understanding of the goals of the visit and how 
a visit would be conducted. 

Visitation assignments. Before the team assign¬ 
ments were made, each institution received a list 
of potential team chairs and program evaluators. 
Each was asked to strike the names of anyone 
on the list who it felt would have a conflict of 
interest or would not be able to give a fair 
evaluation of its program. Similarly, each team 
chair and program evaluator was asked to indi¬ 
cate those institutions that he or she should not 
visit because of any known conflict of interest. 

After receiving this “challenge” information, 
CSAB selected a team chair for each institution. 
Special care was taken to make sure that the per¬ 
son’s background was compatible with the insti¬ 
tution and that there was an adequate 
geographical separation between the location of 
the institution and the team chair’s place of 
employment. Once the person indicated a will¬ 
ingness to accept the assignment, the institution 
was notified. The team chair was then respon¬ 
sible for selecting the rest of the team and mak¬ 
ing the arrangements for the visit. 

To complete the team, each team chair was 
provided with a list of approximately five pos¬ 
sible program evaluators. This list was formu¬ 
lated to ensure that the background of the 
persons on the list would be compatible with the 
institution to be visited. The team chair was free 
to select the two program evaluators to complete 
the visitation team from this list. 

Observers. In some cases arrangements were 
made to allow an additional person to accom¬ 
pany the team as an observer at no cost to the 
institution. Typically, an observer was a team 
chair who wanted to observe an accreditation 
visit before chairing a team on a visit or a mem¬ 
ber of the CSAB staff who wished to observe the 
operation of a visit. Observers were allowed to 
go on a visit only after the chair of the team 
received prior approval from the department 
being visited. 


Preparation for a visit. Each team member 
received a copy of the self-study report prepared 
by the institution and copies of six to 10 tran¬ 
scripts of students who had graduated from the 
program being evaluated. By the time the team 
arrived on campus, it was expected that the team 
members were familiar with the curriculum to 
be evaluated, that they had considered the back¬ 
ground and qualifications of the faculty, and 
that they had a general understanding of the 
organization of the department and the 
resources devoted to the program. Typically, each 
team member would identify specific areas that 
he or she felt required special attention during 
the visit. 

To assist in this study CSAB provided a set of 
summary forms that each visitor could use to 
collect the data necessary to check that the pro¬ 
gram satisfied each part of the accreditation 
criteria. 

The visit. Each visit covers a two-day period. 
Typically, the visit starts on a Sunday evening, 
when the team meets in executive session to 
review the material developed during the previsit 
preparation. The first day is used by the team to 
compare the actual program with that described 
in the self-study document. It is the responsibil¬ 
ity of the team to make sure that the program 
satisfies the published CSAB criteria and that all 
the students graduating from the program com¬ 
pleted the published curriculum requirements. 
This is accomplished by talking with the students 
and the faculty, by visiting the laboratories and 
the departments offering supporting course 
work, and by checking out the support services 
provided by the institution. 

While the team members are visiting the 
department, the team chair is responsible for 
visiting with the administrators, including the 
chief educational officer, responsible for the pro¬ 
gram. These visits allow the chair to assess the 
level of support offered by the institution for the 
program and to determine how the administration 
views the quality and accomplishments of 
the department. These visits also provide an 
opportunity for the team chair to offer sugges¬ 
tions that might improve the program. 

In the evening of the first day the team mem¬ 
bers meet and conduct a preliminary evaluation 
of the results of their investigations and make a 
tentative decision about the accreditation action 
that they feel is appropriate. This discussion will 
typically identify areas that need additional 
investigation or raise questions that must be 
answered by talking to specific individuals. At 
the conclusion of this meeting the team chair will 
make assignments for the second day. 

During the morning of the second day each 
team member will complete assignments and 
collect the information needed to complete the 
program evaluation. By the time the team mem¬ 
bers meet at noon, they are in the position to 
reach a final decision on the accreditation action 
that they will recommend to CSAB. Typically, 
the team has a private lunch during which they 
draft statements of findings about the program. 

The final part of the visit is the exit interview 
with the administration of the institution. At this 
interview the team members will review the 
observations and general impressions they 
gained during the visit. Although they will not 
inform the institution of the accreditation rec¬ 
ommendation that they will make to CSAB, they 
will, by the tone of their comments, try to indi¬ 
cate both the strong points and the weak points 
of the program as well as how well it satisfies the 
accreditation criteria. This meeting also provides 
an opportunity for the institution to clear up 
possible misconceptions that the team may have 
developed or to ask for clarification of specific 
points raised in the discussion. Immediately fol¬ 
lowing the meeting, the team leaves the campus. 

Postvisit activities. Once the visit is complete, 
each evaluator submits a comprehensive con¬ 
fidential visit report to the team chair. This 
report provides a factual summary of each visi¬ 
tor’s findings and indicates, in the view of the 
visitor, the strong and weak points of the pro¬ 
gram. The team chair uses this material to cre¬ 
ate two documents: a final confidential visit 
report and a suggested preliminary statement to 
the institution. 

The confidential visit report is an internal 
CSAB document and is not forwarded to the 
visited institution. This report provides a perma¬ 
nent record of the visit and is used by the officers 
and professional staff of CSAB as a reference 
document when dealing with the institution after 
the visit. 

The initial findings of the visit team, but not 
their accreditation recommendations, are com¬ 
municated to the institution by a document 
called the Preliminary Statement to the Institu¬ 
tion. This is a report that presents, in a formal 
manner, the major observations and conclusions 
of the visiting team. In particular, it details spe¬ 
cific problem areas detected and indicates areas 
where the evaluated program may fail to satisfy 
CSAB criteria. The suggested preliminary state¬ 
ment to the institution submitted by the team 
chair forms the basis of this document. This 
statement is edited by two editors to make sure 
that its content and form correspond to CSAB 
requirements. During the editing process, the 
editors may refer to the visit report or call the 
team chair to clarify specific points that are not 
clearly stated. 

When the preliminary statement is sent to the 
institution, the institution is told that the state¬ 
ment represents a preliminary evaluation of the 
program. Members of the institution are asked 
to respond to this document by correcting any 
errors of fact and by indicating any 
actions that the institution may have taken since 
the visit that would modify the conclusions 
contained in the statement. The institutional 
response is a written report identifying what the 
institution believes to be errors of fact. It may 
also include a list of documented changes that the 
institution has made and that it believes might 
affect some of the preliminary findings and 
conclusions developed by the visiting team. 

Since the major comments contained in this 
report will have already been presented by the 
team chair during the exit interview, it is quite 
common to find that the institution takes 
immediate action to correct detected problems. 
Thus, when the response of the institution is 
received, it is sent to the team chair for evalua¬ 
tion. If this evaluation indicates that the insti¬ 
tution has corrected some or all of the detected 
problems, the chair may recommend that the 
statement to the institution be modified to reflect 
these changes. If the changes are significant 
enough to influence the evaluation of the 
program, the chair may also modify the 
recommended accreditation action. 

The final step in the accreditation process 
occurs when all the team chairs meet as the Com¬ 
puter Science Accreditation Commission to 
make the actual accreditation decisions. Each 
chair receives copies of all of the revised state¬ 
ments to the institutions for review at the meet¬ 
ing. As each program is discussed, the team chair 
makes a presentation that outlines the strengths 
and weaknesses of the program and the recom¬ 
mended accreditation action. After a period of 
discussion, which might lead to a suggested 
change in the recommended accreditation 
action, a motion is made to take a specific 
accreditation action. This action becomes the 
official action of CSAB if it is passed by a 
majority of CSAC. 


Evaluation of the visiting team. CSAB recog¬ 
nizes that the quality of the persons making up 
the visiting team is of key importance to the suc¬ 
cess of the accreditation activity. Thus, special 
evaluation procedures have been established to 
monitor team performance. 

Each institution is asked to fill out a confiden¬ 
tial evaluation of the performance of the visit¬ 
ing team. The institution is told that the 
evaluation will be used only by the officers and 
professional staff of CSAB to monitor team per¬ 
formance and will not have any influence on the 
final accreditation action. Sometimes special sit¬ 
uations occur where institutions may feel that 
they are being treated unfairly by the visiting 
team or a member of the team. In that case, they 
are told that they may contact one of the CSAB 
officers and that every effort will be made to cor¬ 
rect the problem. 

The team chair is asked to evaluate the 
performance of each program evaluator, and each 
program evaluator is asked to evaluate the 
performance of the team chair. This evaluation is 
in the form of a questionnaire filled out shortly 
after the visit and returned to CSAB headquarters. 
A special evaluation committee reviews 
these questionnaires and uses them to guide the 
selection of team chairs and visitors for the 
coming year. 


Appendix C: 

Unsolicited 
commentaries from 
institutions 

Iowa State University (10/86). “I was especially 
pleased that the visiting team, in its report, gave 
such high marks to our faculty and our students 
for their quality and to our administration at all 
levels for its commitment to the excellence of this 
program.” 

“Please convey our thanks to the visiting team 
for its helpful observations and recommen¬ 
dations.” 

Brandeis University (7/86). “Your thorough 
and fair evaluation of our strengths and weak¬ 
nesses has had very positive effects.” 

“Our experience with the CSAB evaluation is 
that it has played a very important role in 
emphasizing to the university’s administration 
the needs and responsibilities involved in offer¬ 
ing a degree in computer science.” 

Worcester Polytechnic Institute (9/86). “It is a 
pleasure and an honor to be included in the first 
programs to be accredited by CSAB. You can be 
certain that we will continue to live up to your 
high standards of accreditation.” 

“Please be certain to convey our thanks to the 
program evaluators who visited us. Dr. J. A. N. 
Lee did a first-rate job as chairman and Mack 
Adams and Dennis Frailey were thorough and 
communicative. The accreditation means even 
more for the quality of this team.” 

Pace University (9/86). “We congratulate 
CSAB for the professionalism with which the 
accreditation cycle was carried out in this first 
year, and we note that our participation in the 
accreditation effort has no doubt had a positive 
effect on our program. We share CSAB’s com¬ 
mitment to enhanced quality of computer edu¬ 
cation in the United States, and we look forward 
to a long and productive association with you.” 

Drexel University (10/86). “I first want to 
express my appreciation for the efficient and 
professional way in which you and the other 
team members of the CSAB visiting team went 
about gathering data at Drexel for your accredi¬ 
tation report. I especially appreciate the very spe¬ 
cific summary report which the team presented 
. . . before you left.” 

“The thorough job which you did as a team 
when you visited Drexel will certainly help us in 
those goals.” 

Unidentified institution that received a not-to- 
accredit action. “Let me express my appreciation 
for the thorough and professional review that the 
CSAB has provided. Also, please extend my 
gratitude to the members of the visiting team for 
its efforts. I found the team’s report to be most 
instructive.” 
CONFERENCE HIGHLIGHTS 

FTCS has been the premier conference 
for the dissemination and discussion of 
new research in dependable systems. 
This year, 48 papers have been selected 
for presentation at FTCS-17 out of 202 
submitted. In addition to the outstanding 
papers, two panel discussions have been 
organized. The first panel will address 
Issues in the Design of Large Automated 
Systems. The second panel will focus 
on Reliability Modeling of Life-Critical 
Systems. Featured in this year's techni¬ 
cal program are topics dealing with soft¬ 
ware and distributed processing issues, 
system modeling and evaluation techni¬ 
ques, and on-line and concurrent test¬ 
ing methods. 


On Monday afternoon, an organized 
visit to the Carnegie Mellon University 
campus is planned. Attendees of 
FTCS-17 will have an opportunity to get 
acquainted with the research activities 
and facilities in the Computer Science 
and Electrical and Computer Engineer¬ 
ing Departments. Efforts are also under¬ 
way to secure industrial support in pro¬ 
viding travel scholarships for graduate 
students working in the fault tolerance 
related areas. For more information 
please contact: 

Prof. John P. Shen 
Carnegie Mellon University 
Session 3A: 10:30 am-Noon (P&LE Room) 

Robust Software 

Chair: R. Schlichting, U. Arizona 

“Crash Recovery For Binary Trees,” D.J. Taylor (U. Waterloo) 

“A Locally Correctable AVL Tree,” I.J. Davis (U. Waterloo) 

“Avalon: Language Support for Reliable Distributed Systems,” M.P. 
Herlihy and J.M. Wing, (Carnegie Mellon) 

Session 3B: 10:30 am-Noon (B&O Room) 

Coding Theory and Self-Checking 

Chair: E. Fujiwara, NTT 

“A Systematic Code For Detecting t-Unidirectional Errors,” N.K. Jha 
and M.B. Vora, (U. Michigan) 

“On Unordered Codes,” B. Bose (Oregon State U.) 

“Three-Level Totally Self-Checking Checker for 1-out-of-n Code,” D.L. 
Tao, P.K. Lala, and C.R.P. Hartman, (Syracuse U.) 

Session 4A: 1:30-3:00 pm (P&LE Room) 

Fault-Tolerant Software 

Chair: T. Anderson, U. Newcastle 

“Hardware- and Software-Fault Tolerance: Definition and Analysis of 
Architectural Solutions,” J.-C. Laprie, J. Arlat, C. Beounes, K. Kanoun, 
and C. Hourtolle, (LAAS) 

“Data Diversity: An Approach to Software Fault Tolerance,” P.E. 
Ammann and J.C. Knight, (U. Virginia) 

“Community Error Recovery in N-Version Software: A Design Study 
with Experimentation,” K.S. Tso and A. Avizienis, (UCLA) 

Session 4B: 1:30-3:00 pm (B&O Room) 

On-Line Testing 

Chair: J. Hayes, U. Michigan 

“Concurrent Error Correction in Unidirectional Linear Arithmetic 
Arrays,” C.-C. Wu (Nat’l Taiwan Inst. of Tech.) and T.-S. Wu (Nat’l Cheng 
Kung U.) 

“Extended Precision Checksums,” N.R. Saxena and E.J. McCluskey, 
(Stanford) 

“An Approach to Fault Diagnosis in Multimicrocomputer Systems: 
Algorithms and Simulation,” D.R. Avresky, K.A. Boyadjiev, S.G. Dinev, 
S.G. Marinov, and S.D. Slavkov, (Higher Inst. for Mech. and Elect. 
Eng., Sofia) 





















Session 5A: 3:30-5:00 pm (P&LE Room) 

Evaluation-based Design of Dependable Software 

Chair: J. Goldberg, SRI Int’l 

“A Conceptual Model for Multi-version Software,” B. Littlewood (The 
City U., London) and D.R. Miller (Geo. Washington U.) 

“An Empirical Study of Software Error Detection Using Self-Checks,” 
S.D. Cha (UC Irvine), N.G. Leveson (UC Irvine), T.J. Shimeall (UC 
Irvine), and J.C. Knight (U. Virginia) 

“Efficient Scheduling in a TMR Database System,” F.M. Pittelli (US 
Naval Academy) and H. Garcia-Molina (Princeton) 

Session 5B: 3:30-5:00 pm (B&O Room) 

Fault-Tolerant System Architecture 

Chair: T.B. Smith, IBM T.J. Watson Research Center 

“A General Purpose Cache-Aided Rollback Error-Recovery (CARER) 
Technique,” D.B. Hunt (Hewlett-Packard) and P.N. Marinos (Duke) 

“A Highly Available Storage System Using the Checksum Method,” Y. 
Dishon and C.J. Georgiou, (IBM T.J. Watson) 

“Design of a Fault-Tolerant Signalling Transfer Point Within a 
Telecommunications Network,” P. Richardson (Monash U.), D. Mansor (La 
Trobe U.), T.S. Dillon (La Trobe U.), and K.E. Forward (Melbourne U.) 

Panel: 5:15-6:30 pm (P&LE Room) 

Reliability Modeling of Life-Critical Systems 

Chair: R.M. Geist, Clemson U. 

Banquet: 7:30-10:00 pm 


WEDNESDAY, JULY 8 


Session 6A: 8:30-9:30 am (P&LE Room) 

System-Level Diagnosis 

Chair: A. Dahbura, AT&T Bell Labs 

“System-Level Fault Diagnosis in Malicious Environments,” R. Gupta 
and I.V. Ramakrishnan, (SUNY Stony Brook) 

“System-Level Fault Diagnosability in Probabilistic and Weighted 
Models,” G. Sullivan (Yale) 

Session 6B: 8:30-10:00 am (B&O Room) 

Robust Algorithms 

Chair: S. Reddy, U. of Iowa 

“A Synthesis Approach to Design Optimally Fault-Tolerant Network 
Architecture,” A. Sengupta (U. So. Carolina), P.D. Joshi (U. So. 
Carolina), and S. Bandyopadhyay (U. Windsor) 

“A Concurrent Error Detecting Conjugate Gradient Algorithm on a 
Hypercube Multiprocessor,” C. Aykanat and F. Ozguner, (Ohio 
State U.) 

“Fault-Tolerant Convolution Using Real Systematic Cyclic Codes,” R. 
Redinbo (UC Davis) 

Session 7A: 10:00 am-Noon (P&LE Room) 

Dependability Evaluation Through Modeling and Measurement 

Chair: J. Meyer, U. Michigan 

“Performance Analysis of a Generalized Upset Detection Procedure,” 
D.M. Blough and G.M. Masson, (Johns Hopkins) 

“A Performability Analysis of Two Multi-Processor Systems,” R.M. Smith 
and K.S. Trivedi, (Duke) 

“Monte Carlo Simulation of Computer System Availability/Reliability 
Models,” A.E. Conway (McGill) and A. Goyal (IBM T.J. Watson) 

“Software Dependability of a Telephone Switching System,” K. Kanoun 
(LAAS) and T. Sabourin (Alcatel Commutation) 

Session 7B: 10:30 am-Noon (B&O Room) 

Reconfigurable Arrays 

Chair: R. Gueth, Brown Boveri Research Laboratory 

“Using Redundancy for Concurrent Testing and Repairing of Systolic 
Arrays,” L.A. Shombert and D.P. Siewiorek, (Carnegie Mellon) 

“A Reconfigurable Modular Fault-Tolerant Binary Tree Architecture,” 
A.D. Singh (U. Mass.) 

“Reconfiguration of VLSI Arrays: A Covering Approach,” F. Lombardi 
(U. Colorado), R. Negrini (Politecnico Milan), M.G. Sami (Politecnico 
Milan), and R. Stefanelli (Politecnico Milan) 

Lunch: Noon-1:15 pm (Santa Fe/Topeka/Atchison Rooms) 

Session 8A: 1:15-2:45 pm (P&LE Room) 

Built-In Self Test 

Chair: S. Murad, Stanford 

“Self Test Using Unequiprobable Random Patterns,” H.-J. Wunderlich 
(U. Karlsruhe) 

“On the Reconvergent Structure of Combinational Circuits With 
Applications to Compact Testing,” B.B. Bhattacharya and S.C. Seth, (U. 
Nebraska) 

“Methodologies for Testing Embedded Content-Addressable 
Memories,” P. Mazumder and J.H. Patel, (U. Illinois) 

Session 8B: 1:15-2:45 pm (B&O Room) 

Reconfigurable Systems 

Chair: Y. Tohma, Tokyo Institute of Technology 

“Cost Analysis of On Chip Error Control Coding for Fault-Tolerant 
Dynamic RAMs,” N. Jarwala and D.K. Pradhan, (U. Mass.) 

“Algebraic Test Generation for Arithmetic Units,” A. Chatterjee and J.A. 
Abraham, (U. Illinois) 

“Yield and Reliability Enhancement of Large-Area Binary Tree 
Architectures,” M.C. Howells and V.K. Agarwal, (McGill) 

Meeting: 3:00-4:30 pm (P&LE Room) 

Fault-Tolerant Computing Technical Committee 



















FTCS-17 Advance Program 


REGISTRATION 

Attendees are encouraged to register in advance using the form 
below. On-site registration will be held prior to the cocktail reception 
on Sunday, July 5 from 4:00 to 8:00 pm and each morning beginning 
at 7:30 am. The registration fee includes one copy of the conference 
proceedings; additional copies may be purchased at the conference. 
For registration information contact: 

Prof. Gary M. Koob 
Registration Chair 
(412) 268-3310 

ARPANET: gmk@gauss.ECE.CMU.EDU 

SOCIAL EVENTS AND SIGHTSEEING 

The planned social calendar begins with a cocktail reception Sunday 
evening and continues with lunches on Monday and Wednesday and 
a banquet Tuesday evening. These four events are included in the 
standard registration fee, but are not covered by the student fee. 
Additional lunch and banquet tickets may be purchased at the con¬ 
ference. A visit to the CMU campus is planned for Monday afternoon. 
(Transportation will be provided.) 

During your free time and for lunch on Tuesday, you can either 
remain in the Station Square area (which contains several fine shops 
and restaurants) or take the short walk across the Smithfield bridge to 
downtown Pittsburgh and its many distinctive shopping and eating 
locations like PPG Place, Market Square, and Oxford Center. For 
sightseeing, ride up one of the inclines to the top of Mount 
Washington for a spectacular view of the city or take a cruise around 
"the point,” where Pittsburgh's three rivers converge. Other points of 
interest include the Carnegie Institute and the Phipps conservatory, 
both of which are adjacent to the CMU campus, and the Buhl 
Science Center on the North Side. 


THE HOTEL 

The Sheraton Hotel is located in Station Square on the Monongahela 
River at the base of Mount Washington. Please contact the hotel 
directly to make reservations. Use the form below or call (412) 
261-2000 and mention that you are attending FTCS-17. 

AIRPORT TRANSPORTATION 

The Greater Pittsburgh International Airport is located 20 miles from 
the conference site. A shuttle service to the Sheraton Hotel at Station 
Square leaves the airport on the hour Sundays from 2:00 pm to 9:00 
pm and weekdays from 8:00 am to 9:00 pm. The one-way cost is 
$8.50. The departure point can be found by following the Ground 
Transportation signs to the information booth located near the US Air 
baggage claim area in the airport’s lower level. Return shuttles leave 
the Sheraton at a quarter past the hour from 7:15 am until 8:15 pm 
weekdays. 

A taxicab can be hired in the same area of the airport. A trip to the 
hotel will cost approximately $30. Numerous car rental companies 
also have operations at the airport. Complimentary parking at the 
hotel is available to registered guests. 

GETTING AROUND TOWN 

Downtown Pittsburgh is located directly across the river from Station 
Square. In nice weather, it can be reached by a pleasant walk across 
the Smithfield bridge. It is also easily accessed by a 3/4 mile ride on 
the city’s new mini-subway which runs from the end of Station Square 
opposite the Sheraton to three stops within the downtown region. 
Most downtown features (including Point State Park) are within 6 
blocks of a subway station. The city’s policy of no bus fares in the 
downtown region makes transportation particularly convenient. Other 
areas of the city are also easily accessible by bus. 


FTCS-17 Registration Form 


FTCS-17 Registration 
P.O. Box 27 

Carnegie Mellon University 
Pittsburgh PA 15213 
USA 

                 Advance*    On-site 
IEEE member      $190 □      $220 □ 
Non-member        230 □       270 □ 
Student            50 □        60 □ 

*Must be received by June 15, 1987 


IEEE Member No. _ 
Amount enclosed _ 


Hotel Reservation Form - FTCS-17 
July 5-8, 1987 

Reservations must be received by June 15, 1987. 

Please mention FTCS-17 when making reservations or complete 
this form and mail directly to: 

The Sheraton Hotel at Station Square 
7 Station Square Drive 
Pittsburgh PA 15219 USA 
(412) 261-2000 

Single $85 □ Double $95 □ 


Name_ 

Affiliation_ 

Address_ 

City_State_Zip_ 

Country____ 

Arrival Date_Hour_ 

Departure Date_ 

To guarantee reservation for late arrival (after 4 pm), please 
provide the following information: 

□ Visa □ M/C □ AmEx 

Card Number_ 
Getting good information from benchmarks 


An approximate measure of the right thing 
is better than an exact measure of the 
wrong thing 

—Jack Harper et al. 

National Oceanic and 
Atmospheric Administration 
c. 1972 

The trouble with benchmarks is that 
all too often we forget the Harper Prin¬ 
ciple (above) and make them more 
important than the machine we’re 
benchmarking. We measure MIPS, 
MFLOPS, and Whets/sec by the mil¬ 
lion and to three decimal places. I have 
yet to find an organization whose job it 
is to produce MIPS, MFLOPS, or 
Whets/sec. Many computer manufac¬ 
turers, software gurus, and computer 
center managers use such measures as 
an important index of the goods and 
services they offer, but the measures are 
not the bottom line any more than miles 
per gallon is the bottom line at GM. 

I like benchmarks. I’ve been writing 
and running them for about 20 years. 
They’re more informative than an aver¬ 
age “flossy glossy” sales brochure. But 
problems arise when we forget that 
benchmark results are just numbers, 
when we forget that a scalar (for exam¬ 
ple, MFLOPS) measures only one 
dimension, when we forget that a com¬ 
puter system is a multidimensional tool, 
and when we begin believing the simple 
numbers. Those are mistakes. 

Remember the advice of Richard 
Hamming: “The purpose of computing 
is insight, not numbers.” Benchmarks 
represent a kind of computing: The goal 
behind their use is to understand the 
power and the utility of a system vis-a-vis 
a problem set. 

Some pundit has said that “‘MIPS’ 
stands for ‘meaningless indication of 
processor speed.’” The use of MIPS 
alone does not provide a measure of 
computer speed any more than rpms 
alone measure a car’s speed: You have to 
know how much each MIPS (or rpm) 
contributes to the machine’s progress. 
Comparisons of MIPS are valid only for 
systems with similar—if not identical— 
instruction-set architectures. 
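
The rpm analogy can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name and all of the figures are hypothetical, chosen to show how a machine with the higher MIPS rating can still take longer on the same job when its instruction set needs more instructions to do the work.

```python
def run_time_seconds(instructions: float, mips: float) -> float:
    """Seconds needed to execute `instructions` at a sustained rate of
    `mips` million instructions per second."""
    return instructions / (mips * 1e6)


# Hypothetical figures for one and the same job on two architectures.
# Machine A needs fewer (richer) instructions; machine B needs more
# (simpler) instructions but executes them at twice the MIPS rate.
job_on_a = run_time_seconds(instructions=2.0e9, mips=4.0)
job_on_b = run_time_seconds(instructions=5.0e9, mips=8.0)

print(f"Machine A (4 MIPS): {job_on_a:.0f} s")
print(f"Machine B (8 MIPS): {job_on_b:.0f} s")
```

With these made-up numbers, the 8-MIPS machine finishes after the 4-MIPS machine, which is why the MIPS ratings of dissimilar instruction sets cannot be compared directly.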

Your best benchmark is the 10 percent 
to 25 percent of all your code that 
accounts for 75 percent to 90 percent of 
your computing workload. Measure that 
workload in machine cycles, bytes of 
storage, hours, or all of these. Evidence 
that new System A will do your comput¬ 
ing work better than new Systems B or 
C or than the system you have now 
should be based on how well each one 
actually does the production jobs: 
MIPS, Whets/sec, and so on are clues, 
not the answer. 
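
One way to find that dominant subset is to rank programs by measured resource use and keep the heaviest ones until a target share of the total is covered. The following is a minimal sketch under stated assumptions: the program names and CPU-hour figures are invented, and a real site would take them from its own accounting logs and might weight storage and output as well as time.

```python
def dominant_programs(usage, share=0.75):
    """Return the heaviest programs whose combined usage first reaches
    `share` of the total, plus the fraction of the workload they cover."""
    total = sum(usage.values())
    chosen, covered = [], 0.0
    ranked = sorted(usage.items(), key=lambda item: item[1], reverse=True)
    for name, cost in ranked:
        if covered >= share * total:
            break
        chosen.append(name)
        covered += cost
    return chosen, covered / total


# Hypothetical monthly CPU hours per program (illustrative data only).
usage = {"weather_model": 410.0, "fft_filter": 260.0, "matrix_solve": 150.0}
usage.update({f"utility_{i}": 20.0 for i in range(9)})

chosen, fraction = dominant_programs(usage, share=0.75)
print(f"{len(chosen)} of {len(usage)} programs cover {fraction:.0%} of the workload")
```

In this invented data set, a quarter of the programs account for roughly four-fifths of the CPU hours, so those are the codes worth converting into a workload benchmark.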

Most sites face some or all of the fol¬ 
lowing problems with workload 
benchmarks. 

• Determining which programs to 
include often takes lots of work. You 
must measure numbers of jobs, volumes 
of storage and output, and time. In 
some cases, that 10 percent subset will 
be several dozen codes. 

• Converting these codes to “standard” 
languages takes work. For example, 
you should remove vendor 
extensions to ANSI Fortran (or ANSI 
Cobol, and so on) and delete other 
excess. 

• Suppose that the benchmark that 
results from completion of the preceding 
two steps runs 24 hours on your present 
system; it will run maybe four 
hours on each target system that you 
want to measure. Time is money to 
computer vendors; they will run such tests, 
but not for fun. 

A common alternative to workload 
benchmarks is kernel benchmarks. 
Many common kernels are often quoted 
in the literature: Examples are the 
Whetstone 1, the Linpack, and the Livermore 
Loops. Kernels offer the advantage that 
they are relatively more portable, and 
common comparisons exist of several 
systems that run them. But measuring 
kernels is not what you do for a living! 
However, carefully designed kernels can 
reveal a lot about systems. 

There should be a test for each 
dimension of computing performance 
that is important to your work. For 
example, there should be tests of CPU 
speed, I/O speed, memory size, arith¬ 
metic precision, graphics capability, and 
so on. (Reliability and maintainability 
are important, but they are not meas¬ 
ured by benchmarks.) One benefit of 
such very short test codes is that they 
often make disputes about correctness 
easier to resolve. In this era of radically 
new computer architectures and cor¬ 
responding new compiler implementa¬ 
tions, it can be crucial to prove whether 
old code is still right—or whether it has 
always been wrong. 

Model your computer-system needs 
and performance; determine experimen¬ 
tally the importance of memory, I/O, 
processor, and so on. Then test those 
things, being very sure to test the combi¬ 
nations of factors that will occur in 
actual work. A system that has lots of 
memory and a very fast CPU may or 
may not run very large working sets 
quickly. (If that sounds contradictory, I 
invite you to experiment.) Issues such as 
cache and virtual-memory management, 


“The Open Channel” is exactly what the name implies: a forum for the free 
exchange of technical ideas. Try to hold your contributions to one page maxi¬ 
mum in the final magazine format (about 1000 words). 

We’ll accept anything (short of libel or obscenity) so long as it’s submitted 
by a member of the Computer Society. If it’s really bizarre we may require you 
to get another member to cosponsor your item. 

Send everything to Jim Haynes at the address given above. 
register sets, and compiler efficiency 
impact the results significantly. (As an 
example, see Technical Memorandum 
No. 23, “Performance of various com¬ 
puters using standard linear equations 
software in a Fortran environment,” by 
Jack Dongarra of Argonne National 
Labs in Argonne, Ill., which reports 
results obtained with the Linpack and 
other benchmarks.) In various sections 
of that report, several of the “crayette” 
systems show significant performance 
differences relative to each other. 

Much simpler tests that vendors have 
run for me suggest that one system in 
question focused on vector-register per¬ 
formance, while another paid more 
attention to memory management; for 
example, when running tight loops that 
crunch floating-point numbers, System 
X seemed faster, but when running 
problems (both vector and scalar) that 
require large memory, System Y was 
faster. 
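
The idea of one short test per performance dimension can be sketched as a pair of tiny kernels, one dominated by floating-point arithmetic and one by memory traffic. This is only an illustration: the kernel names, problem sizes, and stride are arbitrary choices, and in an interpreted language the timings reflect the interpreter as much as the hardware.

```python
import time


def time_kernel(kernel, *args, repeats=3):
    """Run `kernel(*args)` several times; return the best wall-clock time."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        kernel(*args)
        best = min(best, time.perf_counter() - start)
    return best


def tight_float_loop(n):
    """Arithmetic-bound kernel: a small loop that crunches floats."""
    acc = 0.0
    for i in range(n):
        acc += i * 0.5
    return acc


def large_memory_sweep(n, stride=64):
    """Memory-bound kernel: allocate a large array and touch it in strides."""
    data = [0.0] * n
    for i in range(0, n, stride):
        data[i] += 1.0
    return sum(data)


results = {
    "tight_float_loop": time_kernel(tight_float_loop, 200_000),
    "large_memory_sweep": time_kernel(large_memory_sweep, 2_000_000),
}
for name, seconds in results.items():
    print(f"{name}: {seconds:.4f} s")
```

Comparing the ratio of the two timings across systems, rather than either number alone, gives a rough hint of whether a machine favors arithmetic speed or memory handling.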

Bob Estell 

Naval Weapons Center 

China Lake, Calif. 


A definition of complex instruction set computer (CISC) architectures 


How does one distinguish complex 
instruction set computer (CISC) 
architectures from reduced instruction 
set computer (RISC) architectures? 
Recently, D. Tabak* used eight criteria 
to characterize RISC architectures. 

Here, I also use eight criteria, but I use 
them to characterize CISC architec¬ 
tures. The criteria are 

(1) number of machine-language 
instructions (in the case of a CISC, 
it should be as large as possible), 

(2) number of addressing modes (this 
number should also be as large as 
possible), 

(3) number of instruction formats 
(again, the number should be as 
large as possible), 

(4) most instructions require multiple 
cycles for their execution, 

(5) various types of instructions have 
access to the memory (Load/Store 
instructions are not the only ones 
with access, which is the case with 
RISC systems), 

(6) existence of special-purpose 
registers, 

(7) microprogrammed control, and 

(8) machine instructions at a relatively 
high level—one that is close to the 
level of high-level-language 
statements. 

Criteria (1) and (2) can be specified by 
numerical values, while the others are best 
specified by a yes (Y) or a no (N) answer. 
Let’s assume that the following require¬ 
ments must be met for fulfillment of 
criteria (1) and (2): 

(1) the number of instructions must be 
greater than 100, and 

(2) the number of addressing modes 
must be greater than four. 

Table 1 presents the eight criteria as 
found in the following CISC architectures: 

• the Motorola MC68020, 
• Intel 80386, 
• Fairchild Clipper, 
• Zilog Z80,000, 
• AT&T WE32100, 
• Hewlett-Packard Focus, 
• NS32000 Series, and 
• DEC VLSI VAX. 

*D. Tabak, “Which System is a RISC?” Computer, 
Vol. 19, No. 10, Oct. 1986, pp. 85-86. 

Table 1. Eight architectures are evaluated here in terms of criteria that 
characterize CISCs. (Each column—(1) through (8)—represents a different criterion; for an 
explanation of the criteria, see text.)* 

System                   Criteria 
                         (1)   (2)   (3)   (4)   (5)   (6)     (7)   (8) 
Motorola 68020           109   18    Y     Y     Y     Y(16)   Y     Y 
Intel 80386              111   8     Y     Y     Y     Y(8)    Y     Y 
Fairchild Clipper        101   9     Y     N     N     N(16)   N     N 
Zilog Z80,000            110   9     Y     Y     Y     N(16)   Y     Y 
AT&T WE32100             169   18    Y     Y     Y     N(16)   Y     Y 
Hewlett-Packard Focus    230   10    Y     Y     Y     Y(28)   Y     Y 
NS32000 Series           86    14    Y     Y     Y     N(8)    Y     Y 
DEC VLSI VAX             304   21    Y     Y     Y     Y(16)   Y     Y 

*Under certain conditions the values of the numerics and attributes in this table can 
change. 


Table 2. Eight architectures are evaluated here in terms of whether they satisfy (S) 
or violate (V) criteria that characterize CISCs. (Each column—(1) through (8)— 
represents a different criterion; for an explanation of the criteria, see text.) 

System                   Criteria 
                         (1)   (2)   (3)   (4)   (5)   (6)   (7)   (8) 
Motorola 68020           S     S     S     S     S     S     S     S 
Intel 80386              S     S     S     S     S     S     S     S 
Fairchild Clipper        S     S     S     V     V     V     V     V 
Zilog Z80,000            S     S     S     S     S     V     S     S 
AT&T WE32100             S     S     S     S     S     V     S     S 
Hewlett-Packard Focus    S     S     S     S     S     S     S     S 
NS32000 Series           V     S     S     S     S     V     S     S 
DEC VLSI VAX             S     S     S     S     S     S     S     S 


Table 3. Eight architectures are evaluated here on the basis of architectural 
characteristics of CISCs. (Columns (a), (b), and (c) all represent different characteristics; 
for an explanation of the characteristics, see text.) 

System                   Architectural Characteristics 
                         (a)         (b)         (c) 
Motorola 68020           Off-chip    On-chip     190K (1) 
Intel 80386              On-chip     Off-chip    275K (1) 
Fairchild Clipper        Off-chip    On-chip     (3)* 
Zilog Z80,000            On-chip     On-chip     (1)* 
AT&T WE32100             Off-chip    On-chip     146K (3) 
Hewlett-Packard Focus    Off-chip    Off-chip    450K (1) 
NS32000 Series           Off-chip    On-chip     (Q* 
DEC VLSI VAX             On-chip     Off-chip    1.2M (9) 

*Data not available in the literature. 


Table 2 tells whether or not, on a particular 
machine, the eight criteria are satisfied 
(S) or violated (V) by the requirements. As 
can be seen from Tables 1 and 2, most of 
the processors analyzed satisfy most of the 
criteria and should be characterized as 
CISC machines. The only exception is the 
Fairchild Clipper processor, which com¬ 
bines features of CISC machines (these 
features include a large number of 
addressing modes and instruction for¬ 
mats) with features of RISC machines 
(these features include single-cycle 
instruction execution, Load/Store 
architecture, and hardwired control). 
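
The step from Table 1 to Table 2 is mechanical, and a short sketch can make the rule explicit. The code below assumes the thresholds stated in the text (more than 100 instructions, more than four addressing modes) and treats criteria (3) through (8) as plain yes/no answers; the register counts given in parentheses in column (6) are omitted. The names `TABLE_1` and `classify` are illustrative and do not come from the article.

```python
MIN_INSTRUCTIONS = 100      # criterion (1) is satisfied above this count
MIN_ADDRESSING_MODES = 4    # criterion (2) is satisfied above this count

# Rows transcribed from Table 1: instruction count, addressing modes,
# then the Y/N answers for criteria (3) through (8).
TABLE_1 = {
    "Motorola 68020":        (109, 18, "Y", "Y", "Y", "Y", "Y", "Y"),
    "Intel 80386":           (111, 8,  "Y", "Y", "Y", "Y", "Y", "Y"),
    "Fairchild Clipper":     (101, 9,  "Y", "N", "N", "N", "N", "N"),
    "Zilog Z80,000":         (110, 9,  "Y", "Y", "Y", "N", "Y", "Y"),
    "AT&T WE32100":          (169, 18, "Y", "Y", "Y", "N", "Y", "Y"),
    "Hewlett-Packard Focus": (230, 10, "Y", "Y", "Y", "Y", "Y", "Y"),
    "NS32000 Series":        (86,  14, "Y", "Y", "Y", "N", "Y", "Y"),
    "DEC VLSI VAX":          (304, 21, "Y", "Y", "Y", "Y", "Y", "Y"),
}


def classify(row):
    """Map one Table 1 row to its satisfied (S) / violated (V) marks."""
    instructions, modes, *answers = row
    marks = [
        "S" if instructions > MIN_INSTRUCTIONS else "V",
        "S" if modes > MIN_ADDRESSING_MODES else "V",
    ]
    marks += ["S" if answer == "Y" else "V" for answer in answers]
    return "".join(marks)


for system, row in TABLE_1.items():
    marks = classify(row)
    print(f"{system:<22} {marks}  ({marks.count('S')} of 8 satisfied)")
```

Run against the Table 1 data, this reproduces the rows of Table 2, including the Fairchild Clipper's five violations and the NS32000's failure of criterion (1).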

Some architectural characteristics are 
erroneously treated as being typical of 
CISCs; these include a large register file, 
support for memory management and vir¬ 
tual memory, the existence of cache mem¬ 
ory, and large on-chip transistor count. 
Although architecture characteristics are 
not relevant to my classification scheme 
here, they are important, and I have sum¬ 
marized them in Table 3. In Table 3, 
column (a) defines where the memory- 
management unit is placed, column (b) 
defines where the instruction/cache mem¬ 
ory is placed, and column (c) specifies both 
the transistor count and the number of 
chips that are used to build the processor. 

Both CISC and RISC designers tend to 
incorporate a large register file in their 
machines. However, in the case of RISCs, 
these registers all tend to be general 
purpose in nature, while in the case of CISCs, 
a certain number of the registers are of the 
special-purpose type. This fact is reflected 
in Tables 1 and 2. 

It is true that CISCs are typically charac¬ 
terized by a larger on-chip transistor count 
than RISCs. However, a RISC with a large 
on-chip cache memory may incorporate an 
even larger transistor count. 

The existence of a cache memory and of 
support for memory management and 
virtual memory are all equally important in 
both CISCs and RISCs. It is true, how¬ 
ever, that most (but not all) RISCs provide 
off-chip memory management and that 
most CISCs try to incorporate at least 
some elements of memory management 
on-chip. 

Borivoje Furht 
Dept. of Electrical and 
Computer Engineering 
University of Miami 


Confessions of a used-program salesman— fringe benefits 


I have just got to tell you about a 
strange phenomenon that has been hap¬ 
pening in my used-program business. 
Ever since I have been reusing some 
special parts to refurbish programs,* 
my maintenance work has dropped off 
dramatically. I have had to shift my 
personnel around down at the body 
shop by placing more workers in manu¬ 
facturing and fewer in repair. Quite 
frankly, it has been a drain financially, 
since over half my revenue in the past 
has come from maintenance. 

On the bright side, my reputation for 
delivering defect-free products has 
increased the number of customers I 
serve. I don’t mind the shift in 
workload—and to tell you the truth, my 
workers don’t mind it either—for two 
reasons. First, manufacturing software 
with reused parts is a lot more fun than 
maintaining software. Second, 
maintenance now requires less effort. It is 
easier to find bugs because they are 
almost never in the special parts or 
building blocks that we have been 
reusing in each product, but almost always 
in the glue that holds the parts together. 
Finally, new products get easier to 
assemble from these components as we 
become more familiar with the 
components. We are constantly salvaging new 
software pieces to add to our parts 
warehouse whenever a new program to 
refurbish comes into the factory. 
Business is booming. (Now, if I could only 
figure out a way to recoup my lost 
maintenance revenue. Actually, I am 
thinking about going into the 
parts-distribution business, but I haven’t 
worked out the economic and legal 
issues yet.) 

*For a reference to the composition of programs from 
Ada packages, see “Confessions of a used-program 
salesman—an update,” Computer, Vol. 19, No. 6, June 
1986, p. 91. 

Seriously, one of the fringe benefits 


of software reuse is that the quality of 
the delivered product is increased. 
Reusable software components—in par¬ 
ticular, components designed for 
reuse—generally have a very low defect 
rate. Furthermore, with each successful 
use, a component’s confidence factor 
increases, as does the confidence of the 
programmer who reuses the component. 
Indeed, the saying that “one way to 
eliminate software bugs is by not put¬ 
ting them there in the first place” sup¬ 
ports building software from reused 
parts. Actually, I should make a distinc¬ 
tion between “plain parts” (the old sub¬ 
routine library) and “reusable parts” 
(highly parameterized generic pack¬ 
ages), but that is a topic for another 
true confession. 


Sincerely, 

Will Tracz (your friendly 
used-program salesman) 
Computer Systems Laboratory 
Stanford University 


May 1987 




























Advance 

Announcement 

FIRST INTERNATIONAL CONFERENCE 
ON COMPUTER VISION 

Royal National Hotel 
London, England 
June 8-11, 1987 


CONFERENCE COMMITTEE 

Conference Cochairs: 

J. Michael Brady 

Dept. of Engineering Science 

Oxford University 

Parks Road 

Oxford OX1 3PJ, ENGLAND 
Azriel Rosenfeld 

Center for Automation Research 
University of Maryland 
College Park, MD 20742 

Registration & Finance Chair: 
Barbara Hope 

Local Arrangements Chair: 

Bernard Buxton 


PROGRAM COMMITTEE 

Thomas O. Binford (USA) 

Jan-Olof Eklundh (Sweden) 

Olivier Faugeras (France) 

Takeo Kanade (USA) 

Jan Koenderink (The Netherlands) 
Hans-Hellmut Nagel (German Federal Republic) 
Tomaso Poggio (USA) 

Yoshiaki Shirai (Japan) 

Saburo Tsuji (Japan) 

Steven W. Zucker (Canada) 


ROYAL NATIONAL HOTEL 

Royal National Hotel is ideally placed between the City 
and West End. The hotel is within easy walking distance 
of the theatres, Oxford St. and Covent Garden. 
The hotel offers a wide range of restaurants, Carvers, 
Lounge Bar and fully licensed coffee shops as well as 
underground garages & shops. 


THE CONFERENCE 

ICCV is the first International Conference devoted 
solely to computer vision. It is sponsored by the 
IEEE Computer Society in cooperation with the 
International Association for Pattern Recognition 
(IAPR). It will be held in odd-numbered years, alternating 
with the IEEE Computer Society Conference 
on Computer Vision and Pattern Recognition 
(CVPR). 


THE PROGRAM 

The program will consist of invited review papers 
and high-quality contributed papers on all aspects 
of computer vision. All papers will be refereed 
by members of the Program Committee. To avoid 
conflicts, it is not planned to hold parallel sessions. 
A special feature of the program will be a 
set of invited talks on human perception and 
biological visual systems, as a continuation of 
the highly successful series of workshops on 
Human and Machine Vision. 

Session Topics include: 
Image Flow 
Motion Detection 
Structure from Motion 
Stereo 
Matching 

Object Recognition 
3-D Shape 
"Shape from X" 

Highlights and Color 


FOR FURTHER INFORMATION 

Clip and mail to: 

ICCV'87 

c/o Computer Society of the IEEE 

1730 Massachusetts Ave., N.W., 
Washington, D.C. 20036-1903 
(202) 371-0101 


Please send me further information on ICCV'87 as it 
becomes available. 


Name 

Affiliation 


Address 


City, State, Country, Zip 





























STANDARDS 


Editor: Helen M. Wood, National Bureau of Standards, B154 Technology, Gaithersburg, MD 20899; (301) 975-3240. 


Data sharing 

Standard programming languages 
alone do not guarantee complete portability 
of an application across different 
hardware systems. 

Even two programs with different 
languages executing on the same hardware 
may not be able to exchange data 
due to incompatibilities between internal 
data formats or data types. For 
example, a floating-point number may 
not have the same representation in 
both environments. Alternatively, one of 
the languages may include data types 
that are not translatable to any data 
type in the other. 

To improve the situation, X3T2 has 
begun work on a standard to define 
common, language-independent data 
types. The approach taken is to assign 
identifiers for each type, specify the 
external physical representation of each 
type, the conditions under which each 
representation may or must exist, and 
specific mappings between data types. 
The standard will include specific 
procedures for modifying the range of 
data types to allow new types to be 
included in the standard as they become 
necessary. 

For more information, contact 
Leonard J. Gallagher, National Bureau 
of Standards, Technology Bldg., Room 
A266, Gaithersburg, MD 20899; (301) 
975-3251. 


Earlier this year, Accredited Standards 
Committee X3 on Information 
Processing Systems approved a project 
to develop an American National 
Standard for an intelligent peripheral 
interface, logical device specific command 
set for magnetic tapes. The actual 
development work will be carried out by 
Technical Committee X3T9, I/O 
Interface. 

This project will provide an American 
National Standard interface document 
that will facilitate and simplify the 
design, development, and utilization of 
future automated data processing magnetic 
tapes to support the IPI Physical 
standard X3.129-1986. The major 
intrinsic value of this standard will be 
the interconnection of masters to slaves 
(which contain the tape drives) requiring 
high data transfer speeds. The availability 
of microprocessors and other 
LSI circuits will encourage the development 
of intelligent subsystem configurations. 
The timely development of a 
standard in this area should reduce 
incompatibilities in future peripheral 
subsystems, thus reducing the number 
of interfaces. This should lower development, 
documentation, training, and 
maintenance costs for the manufacturer, 
integrator, and end user of such 
systems. 

The standard will provide logical 
definitions for use by magnetic tape 
drives containing varying levels of functionality. 
This standard is directed at 
the lower level, tape device specific 
command set which provides device 
unique functions. The scope of the 
physical and logical/command definition 
of this proposal is broad enough to 
cover the range of medium cost systems 
to the higher cost, higher performance 
systems areas. 

Since X3T9 intends to complete the 
draft American National Standard by 
December 1987, interested participants 
and users are encouraged to get involved 
as soon as possible. For more information, 
contact Delbert Shoemaker, Digital 
Equipment Corporation, 1331 
Pennsylvania Ave. NW, Sixth Floor, 
Washington, DC 20004; (202) 383-5622. 
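
The data-sharing item above describes the X3T2 approach as assigning an identifier to each type and fixing an external physical representation for it. The following is only a minimal sketch of that idea in Python, not the X3T2 draft itself: the identifier values, the one-byte tag, and the choice of big-endian 32-bit integers and 64-bit floating-point numbers are all assumptions made for illustration.

```python
import struct

# Hypothetical type identifiers; the text does not give the real ones.
TYPE_INT32 = 1
TYPE_FLOAT64 = 2


def encode(value):
    """Return a tagged external form: one identifier byte, then the value
    in a fixed byte order and width that does not depend on the host."""
    if isinstance(value, bool):
        raise TypeError("this sketch defines no mapping for bool")
    if isinstance(value, int):
        return struct.pack(">Bi", TYPE_INT32, value)    # big-endian 32-bit integer
    if isinstance(value, float):
        return struct.pack(">Bd", TYPE_FLOAT64, value)  # big-endian 64-bit float
    raise TypeError("no external representation defined for this type")


def decode(data):
    """Map the external form back to a native value using the identifier."""
    tag = data[0]
    if tag == TYPE_INT32:
        return struct.unpack(">i", data[1:5])[0]
    if tag == TYPE_FLOAT64:
        return struct.unpack(">d", data[1:9])[0]
    raise ValueError("unknown type identifier")


print(decode(encode(3.5)), decode(encode(-7)))
```

Two programs that agree on such a table can exchange values even when their internal formats differ, and a type with no entry in the table is rejected explicitly instead of being silently misread.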


Flexible alternatives 

Two new efforts relating to flexible 
disks are underway in Accredited Stan¬ 
dards Committee X3 on Information 
Processing Systems. 

By October 1987, Technical Commit¬ 
tee X3B8 expects to complete develop¬ 
ment of a draft American National 
Standard for data interchange on 90-mm 
(3.5-inch) flexible disk cartridges using 
modified frequency modulation recording 
at 15916 flux transitions per radian 
(ftprad), on 80 tracks on each side 
(2.0-Mbyte unformatted capacity). This 
effort will define a standardized track 
format to facilitate data interchange 
between systems. 
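
The 2.0-Mbyte unformatted capacity quoted above can be checked from the 15916 ftprad figure. The sketch below is a rough calculation, assuming that with modified frequency modulation the quoted transition density corresponds to one data bit per transition position and that every track holds the same number of bits; the function name is illustrative only.

```python
import math


def unformatted_capacity_bytes(ftprad, tracks_per_side, sides=2):
    """Estimate unformatted capacity from angular recording density.

    Assumes one data bit per flux-transition position (an assumption
    about MFM made for this sketch) and identical tracks.
    """
    bits_per_track = ftprad * 2 * math.pi   # a full revolution is 2*pi radians
    bytes_per_track = bits_per_track / 8
    return bytes_per_track * tracks_per_side * sides


capacity = unformatted_capacity_bytes(15916, 80)
print(f"{capacity / 1e6:.2f} Mbyte")   # about 2.00 Mbyte
```

Under these assumptions each track holds roughly 100,000 bits, or about 12,500 bytes, and 160 tracks give approximately 2.0 Mbyte, which agrees with the figure in the text.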

By April 1988, X3B8 expects to finish 
work on a draft American National 
Standard for unformatted 90-mm 
(3.5-inch) flexible disk cartridges at 
15916 ftprad, intended for use in drives 
currently being manufactured. System 
integrators are designing systems capa¬ 
ble of exercising the higher capacity of 
these drives. 

To join X3B8 on either of these 
projects, contact James L. Barnes, 
Polaroid Corporation, 575 Technology 
Square, Cambridge, MA 02139; (617) 
577-4526. 

Meanwhile, on the 130-mm (5.25-inch) 
flexible disk front, X3 is holding a four- 
month public review and comment 
period on draft proposed American 
National Standard X3.162-198x. 

This proposed standard specifies the 
general, physical and magnetic require¬ 
ments for interchangeability for the 
two-sided, 5.25-inch (130-mm) flexible 
disk cartridge (for 13262 ftprad) as 
required to achieve unformatted disk 
cartridge interchange among disk drives 
using 77 or 80 tracks per side and 
associated information processing sys¬ 
tems. The two-sided flexible disk car¬ 
tridge enclosed in a protective envelope 
and having two recording surfaces is of 
the type intended specifically for use 
with digital recording and reproducing 
equipment employing access mechan¬ 
isms capable of positioning to these 
data tracks. 

This draft is available for public 
review and comment for a four-month 
period ending August 9, 1987. Copies 
may be obtained from Global Engineer¬ 
ing Documents, Inc., by calling (800) 
854-7179. 


Why is there so much interest in standards? What are the issues— 
economic, technical, political —involved? Who are the players? These are 
among the topics that will be considered in the Standards Department. 

Short articles on standards-related topics are sought for publication in this 
department. For details regarding scope and length, contact the department 
editor at the address given above. 


Incompatibilities reduced intelligently 
IEEE Standards Board actions 


At its March 1987 meeting, the IEEE 
Standards Board approved two new 
microprocessor standards and a new 
“recommended practice” in the soft¬ 
ware engineering area. The three speci¬ 
fications, which were developed under 
the Computer Society, are 

• IEEE 854, Radix-Independent Standard 
for Floating-Point Arithmetic, 

• IEEE 1014, Versatile Backplane Bus 
(VME), and 

• IEEE 1016, A Recommended Practice 
for Software Design Descriptions. 

To order, write or phone the Computer 
Society Publications Office, 10662 Los 
Vaqueros Circle, Los Alamitos, CA 
90720; (714) 821-8380. Standards may 
also be ordered from the European Office, 2 
Avenue de la Tanche, B1160 Brussels, 
Belgium; 32 (2) 660-11-43. 

The IEEE Standards Board approved 
initiation of two new standards develop¬ 
ment projects within the Computer 
Society. Software Test Documentation 
(P829) will consider needed changes to 
the existing standard in light of four 


years of use and the subsequent devel¬ 
opment of IEEE Std. 1008 Software 
Unit Testing. For more information, 
contact David Gelperin, Software Qual¬ 
ity Engineering, 2425 Zealand Ave. N., 
Golden Valley, MN 55427; (612) 
541-1431. 

At the request of members of the 
Forth Standards Team, an independent 
standards group, the IEEE approved a 
project for the Forth programming lan¬ 
guage. Interest within the Computer 
Society is mostly concerned with the use 
of Forth in microprocessor environ¬ 
ments, while a new, broader activity is 
beginning in the Accredited Standards 
Committee X3 on Information Process¬ 
ing Systems. To help ensure that all 
interests are considered, while avoiding 
duplication of effort, discussions are 
underway to consider formation of a 
joint standards committee with X3. For 
information on this effort, contact J. 
Robert Davis, Summit Computer, 22685 
Summit Rd., Los Gatos, CA 95030; 

(408) 353-2706. 


Ada-based design 
language 


The Naval Avionics Center is 
validating an Ada-based design lan¬ 
guage that is intended to be the 
Defense Dept. standard. The department’s 
Ada board is reviewing the 
initial work. 

The language guideline documents 
are available from SPS, the company 
contracted to develop the language. 
The documents are ADL Guidelines 
Final Report, Guidelines for the 
Development of an Ada-based 
Design Language, and Guidelines 
for the Acquisition Management of 
a Project Using an Ada-Based 
Design Language. 

The printed version costs $40 and 
the diskette version costs $15. For 
copies of the documents or informa¬ 
tion on the validation efforts, con¬ 
tact Kathi Maxson, SPS, PO Box 
361697, Melbourne, FL 32936; (305) 
773-6510. 


Computer 

Science 

Solutions 

from 

Springer-Verlag 


To order by phone, please call 1-800-526-7254 
(toll free), or in NJ, call 1-201-348-4033. 
For mail orders, include $1.50 for 
postage, and sales tax if you reside 
in NY, NJ, or CA. 

We accept personal checks, money orders, 
VISA, MasterCard, and American 
Express. Expiration dates must be noted 
with credit card payments. 
Springer-Verlag New York, Inc. 

Attn: G. Kiely 
175 Fifth Avenue 
New York, NY 10010 


Springer-Verlag 


Computer-Aided Design and 
Manufacturing 

Second Edition 

Edited by U. Rembold and R. Dillmann 

In this second edition, the contributors 
highlight CAD/CAM, robot programming, 
assembly-oriented product design, quality 
assurance systems, manufacturing con¬ 
trol, and economic analysis. 

1986 458 pp. 304 figs. Hardcover 
$88.50. 

ISBN 0-387-16321-2 

(Symbolic Computation) 

Programming Languages for 
Industrial Robots 

Edited by C. Blume and W. Jakob 
A completely revised and extended trans¬ 
lation of the 1983 edition, the book serves 
as an interface between robot technology 
and computer science, allowing experts 
from the two fields to communicate at the 
same level through uniform terminology. 
1986 376 pp. 145 figs. Hardcover 
$49.50. 

ISBN 0-387-16319-0 

(Symbolic Computation) 


New York Berlin Heidelberg 
London Paris Tokyo 


Control Flow and Data Flow: 

Concepts of Distributed Programming 

Edited by M. Broy 

The book includes comprehensive descriptions 
of approaches to the representation, 
specification, design, and verification 
of distributed systems as well as very 
large scale integrated (VLSI) systems and 
parallel hardware architecture. 

1985 525 pp. Hardcover $65.00 
ISBN 0-387-13919-2 
(NATO ASI Series F, Volume 14) 

Forthcoming in 1987 
Open Problems in Communica¬ 
tion and Computation 

Edited by T.M. Cover and B. Gopinath 

With contributions by I. Csiszar, J. Korner, 
R. Ahlswede, A.D. Wyner, H.S. Witsenhausen, 
N. Sloane, A. El Gamal, F. Kelly, T. Cover, 
Y. Abu-Mostafa, J. Ziv, A. Barron, D. Hajela, 
M. Honig, E. Posner, L. Levin, G. Chaitin, 
A. Odlyzko, D. Coppersmith, P. Gacs, R. Gallager, 
D. Sleator, R. Tarjan, W. Thurston, A. Shamir, 
M. Fredman, P. Varaiya, J. Tsitsiklis, E. Gilbert, 
L. Shepp, J. Conway, S. Boyd, V. Anantharam, 
B. Gopinath, and V.K. Wei. 

1987 Approx. 225 pp. 20 illus. 

Soft cover $20.00 (tent.) 
BOOK REVIEWS 


Editor: Wiley McKinzie, School of Computer Science and Technology, Rochester Institute of Technology, Rochester, NY 14623; Compmail, w.mckinzie; CSnet, wrm@rit 


Adieu and Welcome 

With this issue my term as Book Reviews editor expires. As the founding editor of this depart¬ 
ment in Computer some 13 years ago, I have enjoyed developing this service with the assistance 
of all of you excellent friends out there, to whom I would like to acknowledge my grateful thanks 
for your expert contributions and loyal support. I welcome my successor, new editorial board 
member Wiley McKinzie. I am sure he will value your support, suggestions, and comments 
as I did. 

Frank P. Mathur 


Misunderstanding Media 

Brian Winston (Harvard University 

Press, Cambridge, Mass., 1986, 419 

pp., $22.50) 

We hope to learn how to do great 
things ourselves by finding out how great 
things were done in the past. We hope 
that some understanding of the history 
of the development of the electronic 
wonders of our age will help us. To this 
end, historians, scientists, and some of 
those who actually participated in the in¬ 
ventions and developments of the recent 
past have published their studies and 
their memoirs and have sometimes sug¬ 
gested theoretical models of how tech¬ 
nological development proceeds. Much 
of this material is careful, accurate, 
dependable, and useful, but sometimes a 
seriously flawed work of this genre 
appears. This is such a book, unfor¬ 
tunately published by the press of our 
oldest university. 

The author, the dean of the school of 
communications at Pennsylvania State 
University, has a background in jour¬ 
nalism and the performing, or show 
business, side of British television. His 
exposure to that jungle led him to a dis¬ 
enchantment with the gadgetry of elec¬ 
tronic communications and information 
processing and a distrust of the commer¬ 
cial hype about the wonders of the tech¬ 
nological revolution. He made a detailed 
study of the history of the development 
of the interlocked and interrelated spawn 
of electronics and concluded that 


The major underlying assertions 
of the “information” revolution— 
increased information, increased 
pace of change, structured 
industrial innovation—are no 
more sustainable than its detailed 
surface arguments. The flood of 
information is less significant than 
is claimed, the pace of change has 
not increased, the nature of inno¬ 
vation is unchanged. 

Furthermore, he developed his own 
unique model of how technological 
development takes place, starting with 
scientific competence and ending with 
the marketing of products. An essential 
element of his model is his “law” of the 
suppression of radical potential, which 
states that new technologies are intro¬ 
duced into society only insofar as their 
disruptive potential is contained. To 
paraphrase, his book is his argument for 
his model and his “law” and his justifi¬ 
cation for his attack on what he says is 
the fraudulent concept of an information 
revolution. Most of the text consists of detailed 
recitations of the histories of the tele¬ 
phone, television, computers, transistors, 
and satellites—histories that are all 
forced into the Procrustean bed of his 
model and all of which he claims support 
his “law.” 

Much of his historical story is correct, 
for he has liberally used the published lit¬ 
erature. In particular, he makes a contri¬ 
bution to better understanding of tech¬ 


nological development by showing that 
technology does not always follow 
science, the reverse order being frequent. 
But his lack of hands-on familiarity with 
the history he recites and his ideological 
commitment to his model and its essential 
“law” lead him to ascribe all developmental 
delays to interference by big 
business and its lackey government 
agents. He appears to be oblivious to the 
technological problems that seemed 
almost insurmountable at the time but 
that he, with hindsight, now sees to be 
easily soluble. His neglect of these dif¬ 
ficulties seems to be sometimes deliberate 
but is often merely the consequence of 
the fact that they are often ignored or 
de-emphasized in his documentary 
sources. However, his analysis of his¬ 
torical events in the light of what is now 
known was going to happen classifies his 
work as a “Whig” history, that is, a 
historical account that is deliberately 
shaped and interpreted to prove an 
ideological point. 

Examples abound in all his histories. 
Here is one. To explain why computer 
development was not lightning fast right 
after the development of Eniac, Winston 
first asserts that the “von Neumann con¬ 
stant,” a jocular explanation of why 
early computer projects were always late, 
is an example of the author’s “law” of 
the suppression of radical potential. 

Then he writes, 

The only technological question 
mark over the nascent computer 
was the form of its memory, crucial 
since this was the device’s 
distinguishing characteristic. Al¬ 
though nobody had yet used them 
for this purpose, Eckert’s mercury 
lines were to hand and were found, 
eventually, to work as planned; so 
these cannot have been a source of 


Recently published books and new periodicals may be submitted for review 
to the editor at the address given above. 

Note: Publications reviewed in this section are not available from the IEEE 
Computer Society; they must be ordered directly from the publisher. 
delay. Alternatives to the delay 
lines also appeared at this same 
time, before any computer was 
operational, so it is obvious that 
the development of these was not a 
major retarding factor either. 

Anyone who was there then will gasp 
at this nonsense. Memory devices were 
important but some hardly worked at all 
and none of them worked really well. 

The major retarding factors were not the 
author’s “obstruction of science bureau¬ 
crats” (although they were present) or 
what he calls the “shortsightedness of 
potential commercial buyers” (although 
they were present too), but all the 
varieties of pure cussedness, obduracy, 
and opacity of nature and our own lack 
of technological vision in not seeing, as 
Winston now sees, how things should 
have gone. Such nonsense is dangerous if 
taken as fact and then extended, as 


Winston does, to prove his model and 
his “law,” for it may lead us to totally 
misunderstand how great things are 
done. 

While my basic objection to the book 
is its frequent and apparently deliberate 
distortion of history, I am also troubled 
by the careless way that it seems to have 
been written and then checked by its 
publishers. (It was originally published 
by Routledge and Kegan Paul and, after 
being printed in Great Britain, was 
copublished in the United States by Har¬ 
vard University Press. That is, pre¬ 
sumably two sets of referees had at it.) 

In spite of this, the book is littered with 
easily detected errors, often in misspelled 
names. Ironically, it is the Harvard 
pioneers who suffer most. Howard 
Aiken is consistently called Aitken, 

Grace Hopper becomes Hooper, and Ed¬ 
mund Berkeley is Berkely. J. Presper 


Introduction to Data Management and File Design 


R. Kenneth Walter (Prentice-Hall, 

Englewood Cliffs, N.J., 1986, $27.95) 

This is a lower-division text in data 
organization and data access techniques. 
It is intended, according to the author, 
for students who have had a course in 
the concepts of computer hardware and 
software and are acquainted with a 
programming language. The author en¬ 
courages its use for the second course in 
computing of the ACM Curriculum 
Committee on Information Systems 
(CACM, Vol. 25, No. 11, 1982, pp. 
781-805). 

Because it is intended for use with 
students of limited background, the text 
contains much material that many 
instructors will prefer to have in other 
courses, as well as material they will be 
grateful to have in this book. I hope my 
own students know what an operating 
system is, how to convert decimal to 
hexadecimal and vice versa, and how to 
use lists and queues and stacks, before 
they take this course. Many of my 
students haven’t seen punched cards, 
paper and magnetic tape, drums, or 
enough about disks to be helpful. If one 
had to work through all of these topics 
with a class, a year might not be enough. 
If a class knows all of this, there is still 
plenty of material here for a semester— 
disk formats, hashing, partitioned and 
indexed file organizations, space alloca¬ 
tions and catalogs, brief introductions to 
channels, controls, database manage¬ 
ment systems terminology, and so on. 
There is a fairly large assortment of 
elementary data structures subroutines 


provided—in pseudocode and Basic, 
which suggests the low level at which the 
book is pitched. There are enough exercises, 
some with answers, pitched at 
about the sophomore level. There are 
few exercises appropriate for advanced 
majors, or open-ended questions. The 
inclusion of much elementary material 
may make this a better-motivated self- 
study text for some working engineers 
than, say, a comparably elementary data 
structures text. This reviewer sometimes 
teaches an upper-division file manage¬ 
ment course in which the students have 
such a mixed background that having all 
the elementary prerequisites in the book 
for those who need them is useful—but 
then supplementary material (B-trees, 
more on database management, ad¬ 
vanced exercises) must be added. 

Two defects in this book are fairly 
common in many other lower-division 
texts. First, little indication is given to 
the student of what comes next—what is 
left to learn. For example, a chapter on 
sorting of under 20 pages can’t say very 
much about sorting, but it could include 
a page or two on how terribly important 
sorting is (how much real-world com¬ 
puter time is devoted to it) and provide 
some indications of further problems 
that the student may face later and 
where she may look for answers. The 
brevity of the chapter on database 
management is particularly distressing, 
because it doesn’t adequately warn the 
person doing file management of how 
much about database management she 
ought to know. 

Second, there is a real lack of moti¬ 


Eckert is Prosper. In addition, there are 
dozens of typographical errors, so many 
as to have left the author himself out¬ 
raged at both his publishers. 

I think I have the same feeling about 
such errors as a reader with a literary 
background would have about a history 
of Harvard that identified its current 
president as Dirk Bach and its greatest 
poet of modern times as T.S. Elliott. 

We need interpretations of techno¬ 
logical history. We need models and laws 
of technological development. We need 
to know how technological development 
is done. But we need all these things to 
be based on correct, not distorted, 
history, and it would be nice if they 
would get our heroes’ names right. This 
book has it all messed up. 


Eric Weiss 


vating real-world examples (“war 
stories”) on a nontrivial scale. Finding 
that a certain task can be done in 55 
microseconds this way or 37 micro¬ 
seconds that way is a nice exercise: but 
how do you examine a real-world prob¬ 
lem to determine if it is worth doing such 
an analysis, how much might be saved, 
what would it cost to save it, is it worth 
doing? In many practical cases, ex¬ 
perimentation is faster and easier than 
analysis—but no suggestion is made as to 
how to do this. When this reviewer has 
taught the course, an early assignment 
has been something like 

In any available language on any 
available machine: (a) write to disk 
and read back 1000 records of 50 
characters each; (b) change the 
blocking, record length, et cetera, 
by whatever means are available, 
to transfer the 50,000 characters in 
some faster way. Time all ways tried. 
The students complain about the vague¬ 
ness of the assignment, but usually get 
(b) to run in one-half to one-sixth the 
time of (a), even without understanding 
how they did it. This provides powerful 
motivation later in the course, and 
similar experiments, followed by discus¬ 
sions of when to try them “on the job” 
later, have been a popular feature of the 
course. 
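
The reviewer's assignment can be tried directly. The sketch below is one possible reading of it, assuming Python on any machine with a local disk: the file name, the blocking factor of 100 records, and the use of unbuffered file objects are illustrative choices, and the timings will differ from system to system.

```python
import os
import tempfile
import time

RECORD = b"x" * 50      # one 50-character record
COUNT = 1000            # 1000 records, 50,000 characters in all


def write_unblocked(path):
    """(a) One write call per record, with Python's own buffering turned off."""
    with open(path, "wb", buffering=0) as f:
        for _ in range(COUNT):
            f.write(RECORD)


def write_blocked(path, records_per_block=100):
    """(b) Group records into larger blocks before each write call."""
    block = RECORD * records_per_block
    with open(path, "wb", buffering=0) as f:
        for _ in range(COUNT // records_per_block):
            f.write(block)


def read_back(path, chunk_size):
    """Read the file in chunks of chunk_size bytes; return total bytes read."""
    total = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            total += len(data)
    return total


def timed(fn, *args):
    """Return the elapsed wall-clock time of one call, in seconds."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "records.dat")
    results = {
        "write, 1 record per call": timed(write_unblocked, path),
        "read, 50 bytes per call": timed(read_back, path, 50),
        "write, 100 records per call": timed(write_blocked, path),
        "read, 5000 bytes per call": timed(read_back, path, 5000),
    }

for label, seconds in results.items():
    print(f"{label}: {seconds * 1e3:.3f} ms")
```

The measurement reflects operating-system caching as much as the disk itself, so the ratio between (a) and (b) on a modern machine need not match the one-half to one-sixth range reported above; repeating each run several times gives steadier numbers.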

A nice book for a low-level file¬ 
processing course or for self-study—if 
used in conjunction with supplementary 
material. 

Edward T. Ordman 

Memphis State University 
Portraits of Success: Impressions of Silicon Valley Pioneers 


Carolyn Caddes (Tioga Publishing 

Co., Palo Alto, Calif., 1986, 138 pp., 

$45) 

If you’re like me, you’re intrigued by 
learning a bit about the personalities that 
go with the famous names: Who are 
those people who created the combina¬ 
tion of technology, geography, and state 
of mind that since the mid-1970’s has 
been indelibly branded as Silicon Valley? 
One way to find out—and a pleasant 
way at that—is to look through this 
handsomely produced book by photog¬ 
rapher/author Carolyn Caddes. 

Portraits of Success is a photographic 
album of 62 “brilliant inventors, vision¬ 
ary entrepreneurs, and venture capi¬ 
talists” who have played a large role in 
shaping the modern phenomenon of 
Silicon Valley. All the giants are there: 
Fred Terman, the Stanford University 
leader who by common consent merits 
the sobriquet “Father of Silicon Valley”; 
Hewlett and Packard, perhaps his most 
illustrious proteges; William Shockley, 
who, after co-inventing the transistor, 
founded the first semiconductor com¬ 
pany; and many more. Although ardent 
partisans may quibble with the inclusion 
or exclusion of specific individuals, there 
can be no doubt that the selections merit 
the title. 

Each black-and-white photograph is 
accompanied by a brief biographical 
sketch that provides a bit of insight into 
the man (and one woman—Sandra Kurtzig 
of ASK Computer Systems) behind 
the picture. At least as interesting to me, 
Caddes describes her interaction with 
each person: for example, how she 
badgered them into submitting to a sit¬ 
ting, and how they reacted to her and 
her mountain of equipment when she 
arrived. 

Readers of this magazine will be 
pleased to note that the field of artificial 
intelligence, and modern computer sci¬ 
ence more generally, is well represented. 
John McCarthy and Ed Feigenbaum are 
there, as are Don Knuth, Alan Kay, and 
Doug Engelbart. Along with the tech¬ 
nologists, the book includes a section 
containing some of the venture capi¬ 
talists, investment lawyers, public rela¬ 
tions specialists, and, yes, headhunters, 
who have been instrumental in the 
development of the valley. Readers 
might be especially curious to discover 
more about this group of personalities, 
so often mentioned in the abstract but 
not widely known outside of business 
circles. 

Caddes invited each subject to be 
photographed in a setting that felt most 


comfortable, and this has created a 
refreshing diversity in the uniformly 
high-quality portraits. Standard business 
uniforms are well-represented, to be 
sure, but we also see Sheldon Breiner of 
Syntelligence—a marathon competitor— 
in his jogging shoes, Jimmy Treybig of 
Tandem Computers at his ham radio, 
John Young of Hewlett-Packard at his 
kitchen table, and Jerry Sanders of Ad¬ 
vanced Micro Devices perched on a bed 
whose ostentatiousness might make 
Hugh Hefner blush. Choosing a favorite 
portrait in this collection is tough, but 
for me the pictures of greatest historical 
interest show the group that became 


Spreadsheet Applications 

Angelo E. DiAntonio (Prentice-Hall, 

Englewood Cliffs, N.J., 1986, 391 

pp., $17.95) 

The use of information technology in 
the collegiate curriculum offers exciting 
possibilities. With the decrease in unit 
cost and the increase in capability, com¬ 
puter use has become more and more 
common for instructional purposes. In 
particular, certain instructional areas 
have already begun to use information 
technology as a pedagogical device 
through the integration of existing “pro¬ 
ductivity” software (i.e., wordprocess¬ 
ing, spreadsheet analysis, and data- 
basing) in course work. Computer-aug¬ 
mented practices in existing curricula can 
offer novel ways of viewing course 
content/concepts, providing a new edu¬ 
cational medium for instructors. The 
movement toward increased student ac¬ 
cess to instructional computing corre¬ 
sponds to an effort to develop materials 
that provide instructors with support for 
instructional activities. The form these 
materials take varies widely, but the 
availability of these materials is essential, 
if we are to realize the potential of a 
partnership between information tech¬ 
nology and instruction. 

The book under consideration falls far 
short of demonstrating the potential of 
enriching instruction in financial 
accounting through the use of spread¬ 
sheet and information technology. 

Though the preface of the text would 
lead you to believe that the author has 
captured the essential attributes of 
spreadsheet software in the context of 
finance, the techniques and problems 


famous as the Fairchild Eight as they 
were in 1959 and again in the same pose 
in 1985. 

The biographical vignettes bring out 
certain recurring themes. It is hardly sur¬ 
prising to learn that software specialists 
Kay and Knuth are accomplished musi¬ 
cians, or that any number of entre¬ 
preneurs “keep score with money.” But 
perhaps the most prevalent theme is that 
for many of these hard-working, success¬ 
ful individuals, work is play; I like that 
thought the best. 


Peter E. Hart 


in Financial Accounting 

presented are not representative of in¬ 
teresting or powerful uses of the soft¬ 
ware. The book is merely a workbook, 
to be used in conjunction with any text¬ 
book in financial accounting, from 
which the student (or learner) types in a 
template the author has prepared (com¬ 
plete with cell references), enters values 
given by the author, and compares the 
results with those provided. To trivialize 
the matter, the author gives additional 
sets of values (reaching 25 in number) 
that can be entered and the result 
checked against solutions in the book. 

Spreadsheet software used in educa¬ 
tional practice has the potential to allow 
learners to explore the intricacies of 
quantitative decision making and sensi¬ 
tivity analysis, both of which are impor¬ 
tant in financial accounting. Yet this 
application software, as portrayed in 
DiAntonio’s book, provides nothing 
more than minor calculating capacity. 
Though the coverage of financial 
accounting problems may be thorough, 
this model for this area in collegiate 
curricula is inadequate. American educa¬ 
tion in the 1980’s and 1990’s does not 
need another “cookbook” approach to 
learning; it does not need to produce ad¬ 
ditional college graduates who are not 
skilled in the sophistication of problem 
solving and solution interpretation. We 
need instructional material that captures 
the power and elegance of computing for 
problem solving, not trivialization of in¬ 
tellectual activity. 


Richard L. Upchurch 
Southeastern Massachusetts University 
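
The kind of sensitivity analysis the review above has in mind can be illustrated with a short sketch. It is not taken from DiAntonio's book: the simplified income model, the parameter names, and all figures below are assumptions chosen only to show how changing one input at a time exposes its effect on the result.

```python
def net_income(revenue, cogs_rate, operating_expenses, tax_rate):
    """Simplified income statement (hypothetical model, for illustration only)."""
    gross_profit = revenue * (1 - cogs_rate)
    pretax_income = gross_profit - operating_expenses
    tax = max(pretax_income, 0) * tax_rate  # no tax is charged on a loss
    return pretax_income - tax


def sensitivity(base_case, parameter, values):
    """Recompute net income while varying one parameter and holding the rest fixed."""
    rows = []
    for value in values:
        scenario = dict(base_case, **{parameter: value})
        rows.append((value, net_income(**scenario)))
    return rows


if __name__ == "__main__":
    # All figures are invented for the example.
    base_case = {
        "revenue": 500_000.0,
        "cogs_rate": 0.60,
        "operating_expenses": 120_000.0,
        "tax_rate": 0.30,
    }
    for rate, income in sensitivity(base_case, "cogs_rate", [0.55, 0.60, 0.65]):
        print(f"cost-of-goods rate {rate:.2f} -> net income {income:,.0f}")
```

A learner who varies the cost-of-goods rate in this way sees how strongly the bottom line depends on a single assumption, which is the exploratory use of the tool that a fixed template with supplied answers does not exercise.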
NEW PRODUCTS 


Personal System/2 support products unveiled 


IBM has announced printers, SolutionPac 
software, and application programs 
designed for its Personal 
System/2 PC family. 

The printers can also be attached to 
other IBM and to IBM-compatible 
personal-computing equipment. 

The Proprinter II is an enhanced ver¬ 
sion of the Proprinter. A nine-wire dot 
matrix impact printer, the Proprinter II 
has a Courier near-letter-quality font 
and a 240 characters-per-second (cps) 
Fastfont type style. 

The IBM Proprinter X24 is a letter- 
quality, 24-wire dot matrix impact 
printer with high-resolution, all-points- 
addressable graphics capability. It has a 
top speed of 240 cps (12-pitch burst 
speed). 

Another printer, the Proprinter 
XL24, is a wide-carriage version of the 
Proprinter X24. 

The Proprinter II costs $549; the 
Proprinter X24 costs $799; the 
Proprinter XL24 costs $1049. 

The Quietwriter III Printer offers 
executive letter-quality print at 120 cps 
(12-pitch burst speed); it also has a 
draft mode that offers near-letter-quality 
print at 192 cps (12-pitch burst 
speed). In addition, the printer provides 
high-resolution, all-points-addressable 
graphics. 

Quietwriter III is priced at $1699. 

The Personal Pageprinter, a laser 
printer, is for use as part of the IBM 
Personal Publishing System—an appli¬ 
cation package for desktop publishing. 
Printing at the rate of up to six pages 
per minute, the tabletop printer 
produces documents with text, graphics, 
and image. 

It is priced at $2199. 

IBM’s SolutionPacs are tailored 
application packages designed to help 
put the IBM Personal System/2 to work 
for desktop publishers, engineers, doc¬ 
tors, lawyers, builders, and others in 
various industries. They provide soft¬ 
ware, services, and—where 
appropriate—IBM hardware. 

SolutionPacs for use with Personal 
System/2 include Personal Publishing 
System, CADwrite Design and Drafting 
System, and a host-based publishing 
system. 


Two application programs were also 
announced: IBM DisplayWrite 4/2, a 
text-processing program for IBM Personal 
System/2 machines running under the IBM 
Operating System/2, and IBM Storyboard/2, 
a presentation graphics program that can 
accept files created with a video camera 
or scanner. 

They carry one-time license charges 
of $495 and $350, respectively. 

The IBM SolutionPac Personal Pub¬ 
lishing System is a desktop solution for 
the Personal System/2 Model 30. 
Upgrades are available for versions that 
run on the IBM PC XT Model 286 and 
on selected PC AT models. 

The publishing system can be used 
with IBM DisplayWrite 4 and other 
workstation-based applications. A Per¬ 
sonal Publishing System that includes a 
Personal System/2 Model 30 with monochrome 
display is $8553. 

IBM SolutionPac CADwrite Design 
and Drafting System is a stand-alone 
system for two-dimensional drafting. It 
is designed for users with little or no 
CAD experience. 

The system is available at a one-time 
charge of $2740. 

IBM SolutionPac Publishing System 
VM Edition operates on most IBM System/370s 
running the VM Operating 
System. It includes three new software 
products: Publishing Systems ProcessMaster, 
Publishing Systems BookMaster, 
and Publishing Systems BrowseMaster. 

IBM has not finalized the price of the 
VM Edition yet. 


Printers: R.S. No. 20 
SolutionPacs: R.S. No. 21 
Programs: R.S. No. 22 


New VAX computers replace VAX 8200, VAX 8300, VAX 8500 


Digital Equipment Corp. has 
announced three new versions of VAX 
8000 series computer systems. 

The VAX 8530 (above) provides users with 
up to four times the performance of a 
VAX 8200 system, DEC says. 

The VAX 8250, VAX 8350, and VAX 
8530 mid-range systems provide an 
entry-level VAXcluster system and 
replace the VAX 8200, VAX 8300, and 
VAX 8500 systems. They incorporate 
VAXBI bus technology. 

The VAX 8250 is an entry-level com¬ 
puter for use in VAXcluster system con¬ 
figurations. Digital says that the system 
provides users with up to 40 percent 
price-performance improvement over 
the VAX 8200. Prices begin at $65,000. 

The VAX 8350, which replaces the 
VAX 8200 and VAX 8300, can be used 
for stand-alone or VAXcluster system 
applications. DEC says it supplies twice 
the performance of the original VAX 
8200 system at the same price. Prices 
begin at $88,000. 

The VAX 8530, a mid-range VAX for 
all applications, provides users with up 
to four times the performance of a 
VAX 8200 system, DEC says. It 
replaces the VAX 8500 system, which, 
the company reports, was rated at three 
times the performance of the VAX 8200 
system. Prices begin at $291,000. 

8000 Series: R.S. No. 23 
IBM extends PC product line with System/2 family 


IBM has unveiled its Personal Sys¬ 
tem/2 family, which consists of four 
PCs: Models 30, 50, 60, and 80. 

The company has also announced 
displays and disk drives that support the 
PCs. 

Integrated on the Personal System/2 
system board and standard on all the 
PCs are the diskette controller; parallel, 
serial, and pointing device ports; key¬ 
board and memory functions; and color 
and monochrome graphics capabilities. 

The Personal System/2 family uses 
3.5-inch diskettes, which, IBM says, 
store from two to four times more data 
than 5.25-inch, 360K-byte diskettes. 

The company has announced options 
for converting stored information from 
a 5.25-inch format to a 3.5-inch format. 

IBM says that the Model 30, an Intel 
8-MHz 8086-based system, processes 
information up to 2.5 times as fast as 
the IBM PC XT. In addition to the 
standard features built onto the system 
board, the Model 30 system board con¬ 
tains 640K bytes of RAM, as well as 
enhanced graphics support called Mul¬ 
ticolor Graphics Array (MCGA). Three 
general-purpose expansion slots are 
available. 

The Model 30-002, which has two dis¬ 
kette drives, is priced at $1695. 

The Model 30-021, which has one dis¬ 
kette drive and one fixed disk drive, is 
priced at $2295. 

Models 50, 60, and 80 feature a new 
IBM-designed Micro Channel architec¬ 
ture that, the company says, contributes 
to an increase in processing power of up 
to 2 to 3.5 times that of the IBM PC 
AT. Micro Channel architecture enables 
up to 32 bits of data to flow to and 
from the processor. 

The Model 50, an Intel 10-MHz 
80286-based desktop system, provides 
1M byte of RAM (which is expandable 
up to 7M bytes), a 20M-byte fixed disk 
drive, and one 3.5-inch, 1.44M-byte dis¬ 
kette drive (a second diskette drive may 
be added). Three 16-bit, general- 
purpose expansion slots are available. 

The Model 50-021 costs $3595. 

The Model 60, also an Intel 10-MHz 
80286-based system, is a floor-standing 
system available in two configurations. 
One configuration comes with a 44M- 
byte fixed disk drive (a second 44M- 
byte fixed disk drive may be added). 

The second configuration has a 70M- 
byte fixed disk drive; it can be expanded 
with a second 70M-byte or a 115M-byte 
disk drive. 

The Model 60-041, which is equipped 
with a 44M-byte fixed disk drive, is 
priced at $5295. 


The Model 60-071, which comes with 
a 70M-byte fixed disk drive, is priced at 
$6295. 

The Model 80 is based on an Intel 80386 
microprocessor. It is a floor-standing 
machine available in three configurations. 

Seven expansion slots are available to 
accommodate options: three 32-bit and 
four 16-bit slots. Like all Personal Sys¬ 
tem/2 models, this machine features an 
optional math coprocessor. 

One configuration of the Model 80 
runs at 16 MHz, contains 1M byte of 
memory, and features a 44M-byte fixed 
disk drive (a second 44M-byte fixed disk 
drive may be added). 

A second Model 80 configuration 
runs at 16 MHz and has 2M bytes of 
memory and a 70M-byte fixed disk 
drive (a second 70M-byte or a 115M- 
byte fixed disk drive may be added). 

The third Model 80 runs at 20 MHz 
and features 2M bytes of memory and a 
115M-byte fixed disk drive. A 70M-byte 
or a second 115M-byte fixed disk drive 
may be added for a maximum capacity 
of 230M bytes. 

The Model 80-041, which has 1M 
byte of memory and a 44M-byte fixed 
disk drive, is priced at $6995. 

The Model 80-071, which has 2M 
bytes of memory and a 70M-byte fixed 
disk drive, is priced at $8495. 

The Model 80-111, which has 2M 
bytes of memory and a 115M-byte fixed 
disk drive, is priced at $10,995. 

IBM has also announced a family of 
analog displays and unveiled disk drives 
that support the System/2 family. The 
new displays are fully compatible with 
current IBM Color Graphics Adapter 
(CGA) and Enhanced Graphics Adapter 
(EGA) graphics modes. 


Models 30, 50, 60, and 80 (left to right) of 
IBM’s Personal System/2 family. Fixed- 
disk storage capacity ranges from 20M 
bytes in the Model 30 up to an optional 
230M bytes for the Model 80. 


The IBM Personal System/2 Color 
Display Model 8512, a 14-inch color 
display, is priced at $595. 

The IBM Personal System/2 Color 
Display Model 8513, a 12-inch color 
display with finer resolution than the 
Model 8512, is priced at $685. 

The 16-inch, high-addressability IBM 
Personal System/2 Color Display 
Model 8514 costs $1550. 

The 12-inch IBM Personal System/2 
Monochrome Display Model 8503 costs 
$250. 

Other products are designed to allow 
data exchange from 5.25-inch-based 
IBM personal computers to the 3.5-inch- 
based systems. 

The IBM Data Migration Facility 
uses an existing standard printer cable 
to transfer data from an IBM PC, IBM 
PC XT, or IBM PC AT to the Personal 
System/2. 

The IBM Data Migration Facility is 
priced at $33. 

The 3.5-Inch Internal Diskette Drive 
costs $170. 

The 3.5-Inch External Diskette Drive 
is priced at $395. 

The 5.25-Inch External Diskette 
Drive costs $335. 

IBM has also introduced its 3363 
Optical Disk Drive, which it says 
increases storage capacity available to 
the new systems by at least 200M 
bytes—a capacity, the company adds, 
that is more than 550 times that of the 
industry-standard 360K-byte diskette. 
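
The "more than 550 times" figure can be checked with a few lines of arithmetic. The announcement does not say whether "K" and "M" are decimal or binary multiples, so the sketch below treats both readings as assumptions and computes the ratio each way.

```python
# Unit conventions are assumptions; the announcement does not define them.
DECIMAL_UNITS = {"K": 1_000, "M": 1_000_000}
BINARY_UNITS = {"K": 1_024, "M": 1_024 * 1_024}


def capacity_ratio(optical_mbytes, diskette_kbytes, units):
    """Return how many diskettes' worth of data the optical drive holds."""
    optical_bytes = optical_mbytes * units["M"]
    diskette_bytes = diskette_kbytes * units["K"]
    return optical_bytes / diskette_bytes


if __name__ == "__main__":
    for label, units in (("decimal", DECIMAL_UNITS), ("binary", BINARY_UNITS)):
        ratio = capacity_ratio(200, 360, units)
        print(f"{label} units: {ratio:.1f} times a 360K-byte diskette")
```

Under either reading the ratio exceeds 550, which is consistent with the company's statement.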

The IBM 3363 Optical Disk Drive 
Model costs $2950. 

Personal System/2: R.S. No. 

Displays: R.S. No. 

Disk drives: R.S. No. 26 
Migration facility: R.S. No. 27 
80386 Unix is commercially available for 386-based PCs 


Microport Systems, Inc., has an¬ 
nounced what is reputedly the first com¬ 
mercially available 80386 Unix 
operating system for 386-based personal 
computers. 

The Microport Runtime System 
V/386 is designed to run on Unix System 
V.3. It is upwardly compatible with 
Microport’s System V/AT for 80286-based 
PCs, and it makes available all 
user-interface enhancements and system 
improvements developed during the last 
two years. 

The Runtime System V/386 carries 
an introductory price of $299. 

R.S. No. 28 


Digital Equipment Corp. has 
unveiled the MicroVAX 2000 multiuser 
system, which, DEC says, has the CPU 
performance of the MicroVAX II system 
but costs half the price of the least 
expensive MicroVAX II model. 

DEC has also announced the VAXstation 
2000 desktop workstation. Like 
the MicroVAX 2000, the workstation 
provides CPU performance equal to 
that of the MicroVAX II system. 

In addition, the company has 
announced the VAX Solution Systems 
program for work-group computing, 
under which DEC and its third-party 
vendors design, build, and test fully 
integrated systems that customers use in 
specific industry applications. The Solution 
Systems combine DEC’s hardware, 
software, communications, and services 
with hardware and software from the 
third-party vendors. 

The MicroVAX 2000 supports up to 
four directly connected users and up to 
16 users with network connection. It 
operates in VMS and Ultrix, and in 
stand-alone or clustered environments. 

MicroVAX 2000 systems range from 
an entry-level configuration (priced 
below $10,000 net) for OEMs that has 
4M bytes of memory, a 42M-byte hard 
disk, and a 1.2M-byte floppy disk, to a 
fully configured system (list priced at 
$20,195) that features 6M bytes of 
memory, two 71M-byte hard disks, and 
a 95M-byte storage tape. The entry-level 
configuration with VMS or Ultrix 
operating system license is list priced at 
$11,100. 

A diskless MicroVAX 2000 system, 
for use in Local Area VAXcluster con¬ 
figurations, is also available. It features 
6M bytes of memory, a ThinWire 
Ethernet interface, and appropriate 
software licenses. The list price is 
$12,900. 

The VAXstation 2000 workstation 
offers dedicated 32-bit processing, 
graphics, windowing, and networking 
capabilities. 

A diskless monochrome version is 
available for $10,500. The disk-based 
monochrome workstation is priced at 
$13,500. 

Prices for the Solution Systems begin 
at under $100,000 and depend on the 
configuration. The non-DEC hardware 
and/or software in each VAX Solution 
System is available from DEC’s System 
Cooperative Marketing Program sup¬ 
pliers and its Cooperative Marketing 
Program suppliers. 

MicroVAX 2000: R.S. No. 29 
VAXstation 2000: R.S. No. 30 

Solution System program: R.S. No. 31 


CAE workstations have 
80386 microprocessors 

Daisy Systems Corp. says that two 
members of its new line of CAE work¬ 
stations are the industry’s first CAE 
systems to use Intel’s 80386 microprocessor 
as their CPUs. 

One of the 80386-based workstations 
is the Personal Logician 386, a desktop, 
32-bit workstation that is compatible 
with the IBM PC AT. The other is the 
Logician 386, a 32-bit, accelerated 
graphics workstation. 

The third member of the workstation 
family is the Personal Logician 286. 

A base system ($50,000) for the Logi¬ 
cian 386 includes an 80386 CPU, a 
high-performance graphics accelerator, 
a 19-inch color monitor (1024 x 832 
pixels), an IBM PC AT-compatible 
floppy-disk drive, a 60M-byte quarter- 
inch cartridge tape drive, Ethernet 
LAN, an 85M-byte hard disk, and 4M 
bytes of memory. 

The base system ($20,000) for the 
Personal Logician 386 includes, besides 
Intel’s 80386 microprocessor, 2.5M 
bytes of RAM, a 53M-byte hard disk 
(44M bytes formatted), a 1.2M-byte 
floppy-disk drive, an 80287 coproces¬ 
sor, EGA graphics, and a 13-inch color 
monitor (640 x 350 pixels). 

A base system ($15,000) for the Personal 
Logician 286 includes an 8-MHz 
IBM PC AT with 2M bytes of RAM, a 
30M-byte hard disk, a 1.2M-byte floppy 
disk drive, EGA graphics, and a 13-inch 
color monitor (640 x 350 pixels). 

Workstations: R.S. No. 32 

Ashton-Tate software is 
compatible with new 
IBM PCs 

Ashton-Tate Corp. has announced 
that all of its major software products 
are available in 3.5-inch disk format for 
use on IBM’s Personal System/2. 

The company added that in addition 
to shipping its products in packages 
containing 5.25-inch formatted disk 
versions, it has begun shipping both 
3.5- and 5.25-inch disk versions of its 
software products in one package. 

The suggested list prices for the pack¬ 
ages containing both sets of disks are as 
follows: dBase III Plus and Framework 
II, $725; RapidFile, $420; MultiMate 
Advantage, $620; Chart-Master, $395; 
Diagram-Master, $365; Map-Master, 
$415; and Sign-Master, $265. 

3.5-inch software: R.S. No. 33 
Two-disk packs: R.S. No. 34 



DEC’s MicroVAX 2000 system is configured here with two 5.25-inch 71M-byte Winchester 
disks and a 95M-byte cartridge tape. 


Computer offers MicroVAX II CPU capability at 
half the cost 



DEC adds to VAX Integrated Publishing offerings 


Digital Equipment Corp. has 
extended its VAX Integrated Publishing 
offerings with the addition of the VAXmate 
Publishing Solution for work-group 
applications and the VAX 
Departmental Publishing Solution for 
departmental applications. 

In VAX Integrated Publishing solu¬ 
tions, DEC combines its hardware, 
software, communications, and services 
with hardware and software supplied by 
its third-party vendors, with the goal of 
meeting customers’ specific publishing 
needs. 

VAX Integrated Publishing includes 
two solutions for work-group pub¬ 
lishing. 

The VAXstation Publishing Solution 
System consists of the 32-bit Digital 
VAXstation 2000 and TPS software 
(supplied by Interleaf, Inc.). To share 
central storage and peripherals, multiple 
VAXstations can be connected in a 
diskless environment with DEC’s VAXcluster. 

A six-station configuration is priced 
at $154,060. 

The Ethernet-based VAXmate VIP 
Publishing System ($6670) consists of a 
VAXmate, MS-Windows, MS-Chart 
business graphics software, WPS-Plus 
word-processing software, and Aldus 
PageMaker desktop publishing software 
from Aldus Corp. With the addition of 
the desktop laser ScriptPrinter, which 
costs $6295, the system is priced at 
$12,190. 

Prices for VAX Departmental Pub¬ 
lishing Solutions begin at $41,691 and 
depend on the requested configuration. 

Solutions: R.S. No. 35 
Printer: R.S. No. 36 


Gould announces the first of 
a new family of minisupers 

Gould, Inc., has introduced NP1, the 
first in its new family of NPL (N-Processor 
Line) minisupers. 

There are 40 models of NP1—20 with 
basic CPUs and 20 that have optional 
arithmetic accelerators in addition to 
the basic CPUs. The accelerators, 
which increase MIPS ratings, are 
designed for scientific processing appli¬ 
cations. 

The largest of the systems, the 
arithmetic-accelerator-configured NP1 
480-SP ($2,900,000), offers 96 MIPS 
(320 MFLOPS), four buses, eight 
CPUs, and four billion bytes of physi¬ 
cal memory. 

The entry-level single-processor sys¬ 
tem, the Model 110 ($395,000), offers 
64M bytes of memory. 

The entry-level dual-processor sys¬ 
tem, the Model 120 ($595,000), offers 
two CPUs, 256M bytes of memory 
(expandable to up to 1024M bytes), one 
bus, and 20 MIPS. The arithmetic- 
accelerator-configured Model 120-SP 
($675,000) offers 24 MIPS, and the 
accelerator executes the built-in vector 
instructions of the Model 120 at a peak 
rate of 40 MFLOPS per processor. 

The vector instructions of the NP1 
are supported by an automatically vec¬ 
torizing Fortran 8X compiler, which 
searches for vectorizable code in a nor¬ 
mal Fortran program and generates the 
vector operations required without 
intervention from the programmer. 

R.S. No. 40 


Word processor includes command-customizing facility 


Borland International has introduced 
a word-processing program with a Soft 
User Interface that allows users to cus¬ 
tomize their set of word-processing 
commands from the keyboard without 
backtracking to a different menu. 

The company says that the product, 
Sprint: The Word Processor, features a 
very English-like command structure 
and file conversion for WordStar, 
WordStar 2000, WordPerfect Versions 
4.1 and 4.2, XyWrite Versions II and 
III, MultiMate, and MultiMate 
Advantage. 


Borland adds that Sprint: The Word 
Processor provides a full-blown macro 
language and that the word processor 
saves incrementally while the user types 
at the keyboard. The backup timer, 
Borland says, can be set to perform 
incremental saving as often as needed, 
without slowing down the system. 

The suggested retail price is $195. 
Delivery is scheduled to begin in the sec¬ 
ond half of 1987. 

R.S. No. 37 


Commercial system to provide intervendor network computing 


In the third quarter of 1987, Apollo 
Computer, Inc., will release what it says 
is the first commercially available set of 
distributed computing products for 
developing and running application 
programs across networks of multiven¬ 
dor computers. 

According to the company, the Net¬ 
work Computing System (NCS) is 
designed to distribute modules, or parts 
of a single application program, to 
those specialized computers best suited 
for each module’s task. Apollo adds 
that the system is also designed in such 
a way that it will automatically make 
productive use of idle computing 
resources on the network and distribute 
program modules concurrently. 

The NCS product includes three 
major components: 

• a remote procedure call (RPC) runtime 
environment that is transparent 
to application programs and 
handles packaging, transmission, 
reception of data, and error correction 
between the client and server 
subroutines; 

• a network interface definition compiler, 
which compiles Apollo’s 
high-level Network Interface Definition 
Language (NIDL) into 
portable C-language source code 
that runs on both the client and 
server computers; and 

• a set of software tools that let 
applications determine at runtime 
which computers can provide the 
services they require. 
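
The division of labor among the three components listed above can be sketched in a few dozen lines. This is not Apollo's NCS or NIDL: the class names are hypothetical, JSON stands in for whatever wire format the real runtime uses, and the "network" is a direct in-process call. The sketch only shows a client stub that packages a call, a server that unpacks and executes it, and a registry consulted at run time to find a provider.

```python
import json


class Registry:
    """Hypothetical stand-in for the runtime lookup tools: service name -> servers."""

    def __init__(self):
        self._providers = {}

    def register(self, service, server):
        self._providers.setdefault(service, []).append(server)

    def lookup(self, service):
        providers = self._providers.get(service)
        if not providers:
            raise LookupError(f"no computer offers service {service!r}")
        return providers[0]


class Server:
    """Server side: unpacks a request, runs the procedure, packs the reply."""

    def __init__(self, name):
        self.name = name
        self._procedures = {}

    def export(self, proc_name, func):
        self._procedures[proc_name] = func

    def handle(self, packet):
        request = json.loads(packet.decode("utf-8"))
        try:
            result = self._procedures[request["proc"]](*request["args"])
            reply = {"ok": True, "result": result}
        except Exception as exc:  # report the failure to the caller
            reply = {"ok": False, "error": str(exc)}
        return json.dumps(reply).encode("utf-8")


class ClientStub:
    """Client side: makes a remote procedure look like an ordinary call."""

    def __init__(self, registry, service):
        self._registry = registry
        self._service = service

    def call(self, proc_name, *args):
        server = self._registry.lookup(self._service)  # decided at run time
        packet = json.dumps({"proc": proc_name, "args": list(args)}).encode("utf-8")
        reply = json.loads(server.handle(packet).decode("utf-8"))  # simulated transmission
        if not reply["ok"]:
            raise RuntimeError(reply["error"])
        return reply["result"]


def dot_product(xs, ys):
    """Example procedure that a compute-oriented machine might export."""
    return sum(x * y for x, y in zip(xs, ys))


if __name__ == "__main__":
    registry = Registry()
    vector_host = Server("vector-host")
    vector_host.export("dot_product", dot_product)
    registry.register("math", vector_host)

    stub = ClientStub(registry, "math")
    print(stub.call("dot_product", [1, 2, 3], [4, 5, 6]))
```

In a real system the stubs would be generated from an interface definition rather than written by hand, and the packet would travel over a datagram service instead of a function call.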

One version of NCS will be an open 
system written in portable C code 
(source licenses are available from 
Apollo); however, Apollo will also 
introduce NCS source code for Unix- 
based systems and for Digital Equip¬ 
ment Corp. VAX/VMS systems. 


The system uses low-level datagram 
services available on all networking pro¬ 
tocols, including TCP/IP (for Ethernet 
and DECnet), Apollo’s DDS (Domain 
Distributed Services), IBM SNA, and 
MAP/TOP. The NCS also comple¬ 
ments popular distributed file systems, 
such as NFS, RFS, and Apollo’s 
Domain, by providing the computa¬ 
tional sharing absent from those 
systems. 

The NIDL Compiler is priced at 
$1000 per node or $8500 per site; the 
NIDL Source Code, at $25,000 (on 
tape); the NCS Unix Runtime Source 
Code, at $1000; the NCS VAX/VMS 
Source Code, at $1000; and the Apollo- 
Specific NCS Documentation and Run¬ 
time Source Code, at $250. 


Compiler: R.S. No. 38 
Source codes: R.S. No. 39 
ExperCommon Lisp II configured for Macintosh II 


PC version of Prime Medusa 
software released 

Prime Computer, Inc., has intro¬ 
duced Prime Medusa/pc software, a 
two-dimensional version of its Prime 
Medusa mechanical CAD software sys¬ 
tem. Prime Medusa/pc software can be 
used on an IBM PC AT operating 
within a Prime 50 Series minicomputer 
environment. 

Prime Medusa/pc is completely com¬ 
patible on the database level with Prime 
Medusa on a 50 Series system. It 
includes variational geometry and Super 
Syntax, a graphical programming lan¬ 
guage that permits a user to tailor CAD 
applications. 

Prime Medusa/pc costs $5000 per 
license, with monthly maintenance of 
$65 for a single license, including updates 
and support. Quantity discounts and site 
licenses are available. 

Prime Computer, Inc., has also intro¬ 
duced Prime Medusa Revision 4.0, 
which includes an attribute-management-system 
feature for developing applications, 
as well as enhancements to existing 
capabilities. 

MidasPlus (Multiple Index Data 
Access Systems) software, a data- 
management system, is a prerequisite 
for using the attribute-management 
system. 

The price of Prime Medusa Revision 
4.0 was unavailable at the time of this 
writing. 

Prime Medusa/pc: R.S. No. 41 
Prime Medusa 4.0: R.S. No. 42 


Image-processing system uses 
dual AT/video bus design 

Comtal/3M has unveiled VisionLab 
II-T, a digital image-processing system 
that utilizes a proprietary dual AT/video 
bus design and is built around the 80286 
microprocessor. 

VisionLab II-T can be integrated into 
systems or used as a separate image- 
processing workstation. The standard 
configuration provides 640K bytes of 
RAM, expandable to 3M bytes, and an 
8-MHz clock rate. 

The AT bus controls interprocess 
communication, while the video bus is 
used for high-speed direct-memory- 
access operations between the system’s 
image-processing boards. The result is 
an image transfer rate of up to 36M 
pixels/sec. An Intel 80287 coprocessor 
is used for floating-point calculations. 

VisionLab II-T: R.S. No. 43 


ExperCommon Lisp II, a Lisp prod¬ 
uct designed by ExperTelligence, Inc., 
for desktop AI programming, has been 
configured for the Macintosh II. 

The price of the product is $1195. 

ExperTelligence has also unveiled a 
new version of its ExperCommon Lisp 
AI language. The new version is specifi¬ 
cally designed for educational and 
instructional purposes and is priced at 
$195 for universities and students. 

ExperTelligence says that the new 
version, which is designated ExperCom¬ 
mon Lisp/Ed, is the first Common Lisp 
to sell below $200. The company adds 
that ExperCommon Lisp/Ed has over 
500 primitives and ExperCommon Lisp 
features such as defstructs, N-dimensional 
arrays, and a full complement of 
sequencers; an on-line symbolic debugger 
that identifies errors in process and 
allows the programmer to trace the program 
path back to the source, correct 
the defective value, and immediately 
resume execution of the program; and a 
class system with a full set of tools for 
object-oriented programming. 

ExperTelligence has also introduced 
educational-use versions of two other 
AI products: ExperLogo/Ed, at a price 
of $75 for students and educational 
institutions (the retail price is $150), and 
ExperLisp/Ed 1.5, at $149.95 for students 
and educational institutions (the 
retail price is $495). 

All of the products run on a Macintosh 
with 512K bytes of memory except 
ExperCommon Lisp/Ed, which requires 
a Mac Plus. 

ExperCommon Lisp II: R.S. No. 44 
/Ed products: R.S. No. 45 


RISC-based building block offers 8-MIPS performance 


MIPS Computer Systems, Inc., has 
announced the M/800 System, which, 
the company says, offers overall sustained 
performance of 8 MIPS and can 
be configured as a compute server, network 
disk- or fileserver, or as a multiuser 
system. 

The M/800 is designed to be a flexible 
platform for OEMs building high-performance 
Unix systems. 

The M/800 is based on the company’s 
proprietary 32-bit RISC architecture; 
its heart is the firm’s 12.5-MHz 
R2600 CPU Board. The standard configuration 
for the M/800 System 
includes 8M bytes of main memory with 
error-correcting code, a 12-slot VMEbus 
cardcage, a 337M-byte disk drive, 
and a 60M-byte 1/4-inch tape drive. 

Also provided is support for Ethernet, 
TCP/IP, Sun Microsystems’ Network 
File System, and Sun’s PC/NFS. 

The M/800, which will be available in 
July, will sell to OEMs for $51,330 
(quantities of 10). 

MIPS Computer Systems has also 
announced the R2010 Floating Point 
Accelerator, a tightly coupled, synchronous 
coprocessor to the MIPS 
R2000 microprocessor. It provides 6 
MFLOPS peak performance and is 
priced at $1600 (quantities of 10). 

M/800: R.S. No. 46 
R2010: R.S. No. 47 


Products link Macintoshes with IBM mainframes 


Avatar Technologies, Inc., has introduced 
MacMainframe SE, a hardware/software 
communications product that 
connects the Macintosh SE to IBM 
mainframes. 

MacMainframe SE provides 3278 terminal 
emulation and file transfer. It 
consists of a Macintosh software diskette 
and an internal card that plugs 
into the expansion port of the Macintosh 
SE computer. A coaxial connector 
extends from the Macintosh SE’s access 
panel to an IBM control unit or Display 
Printer Adapter via type A coax cable. 

MacMainframe SE is priced at $795. 

MacMainframe DX includes software 
features like binary transfer and text file 
filtering. The binary transfer feature 
supports transfer of non-Macintosh 
binary files to and from IBM mainframes. 
When used with MS-DOS file 
transfer products such as Avatar’s 
PA100 Turbo or IBM’s PC3270 product, 
MacMainframe allows Lotus 1-2-3 
WKS files, IBM Document Content 
Architecture files, and Microsoft 
.SYLK files to be used by both IBM PC 
and Macintosh users. The binary transfer 
feature enables users of existing 
3270 communications networks to add 
micro-to-micro communications. 

The price of MacMainframe DX is 
$1195. 

MacMainframes: R.S. No. 48 

Products support integrated 
voice, data, and video 
networking 


Infotron Systems Corp. has intro¬ 
duced two Infostream network 
exchanges: the Infostream NX4600 and 
NX3000 Network Exchange products, 
which are designed for integrated voice, 
data and video networking. 

The NX4600 supports up to 96 high¬ 
speed links per node, or as many as 
4000 local channels. Up to 64 NX nodes 
can be interconnected within a single, 
integrated network. 

The NX3000 supports up to four 
links and 24 local channels. 

The NX products handle various link 
speeds simultaneously, including 56K 
bps, 64K bps, 256K bps, 1.544M bps 
and 2.048M bps. 

They support point-to-point, drop- 
and-insert, pass-through, ring, and full- 
mesh network topologies. They are 
compatible with Infostream mul¬ 
tiplexers and INX switching systems, 
enabling the construction of varying 
network configurations. 

The NX4600 base unit with full 
redundancy is priced at $20,000, and 
the NX3000 base unit costs $10,000. 

NX4600: R.S. No. 49 
NX3000: R.S. No. 50 

Kit adds up to 630M bytes of 
storage to Deskpro 386 
models 

ACS Telecom has announced its 
10-Disk/386 fileserver expansion kit for 
the Compaq Deskpro 386. 

The 10-Disk/386 system can add up 
to 630M bytes of hard-disk storage to 
the Deskpro 386 Models 40, 70, or 130. 
It features a 10M-bps data-transfer rate
and 1:1 interleave. 

The expansion kit also uses up to 8M 
bytes of high-speed, 32-bit, static-column 
RAM to cache most disk-read requests, 
with an intelligent most-frequently-used 
algorithm, for peak performance. The 
system is completely compatible with 
the built-in Compaq 386 disk control¬ 
ler, for a combined total of up to 760M 
bytes. 

ACS says that the 10-Disk/386 expansion
kit is compatible with every leading
local area network system, including 
ACS Telecom’s 10-CAD, 10-Net from 
Fox Research, 3Com, IBM’s Token 
Ring, and ARCnet. 

Prices for the 10-Disk/386 kit start at 
$5595. 


10-Disk/386: R.S. No. 51 


Integrated office system handles data over Ethernet 


Eastman Kodak Co. has announced 
the KIMS System 5000, a networked 
information-management system that 
captures, stores, manipulates, and 
delivers information and images at mul¬ 
tifunction workstations by means of 
Ethernet LAN. 

The KIMS System 5000 can accept 
data from mainframes and other net¬ 
works through gateways and protocols 
that include IBM’s Systems Network 
Architecture. 

Kodak has also introduced other 
members of the KIMS family: the Sys¬ 
tem 3000, a stand-alone system for 
managing image-intensive information, 
and the System 4500, a microfilm-based 
computer-assisted retrieval system. 

The version of the KIMS System 5000 
that incorporates 12-inch optical disks 
provides data storage and retrieval 
through a disk library that has a capac¬ 
ity of up to 121, 12-inch optical disks, 
each storing 2.6G bytes. 

The film-based KIMS System 5000 
uses a robotic film library that holds up 
to 372 rolls of microfilm. 

The cost will depend on the storage
medium chosen and on the hardware
configuration, but the multiuser system
will be priced at approximately
$700,000.

Optical disk-based, the KIMS System 
3000 includes a multiserver linked to 
one or more large format workstations 
with multiple-windowing capability. It 
also features a laser scanner, applica¬ 
tions software, a user interface, and on¬ 
line access of up to 2G bytes of infor¬ 
mation. 

The KIMS 3000, which is designed to 
be upgradable to a KIMS 5000, will be 
priced at approximately $150,000. 

The KIMS 4500 is a start-up choice
for persons who want to build and
access databases today and anticipate
upgrading to the film-based KIMS
System 5000 in the future. (The KIMS
System 4500 uses the same hardware and
software as the KIMS System 5000,
including DEC’s MicroVAX II computer,
a Kodak IMT-350 microimage
terminal, and a Kodak Reliant intelligent
microfilmer 2000.)

The KIMS 4500 will be priced at 
approximately $150,000. 

Delivery of all three KIMS systems is 
scheduled to begin in mid-1987. 

KIMS family: R.S. No. 52 


Kaypro introduces complete desktop publishing system


Kaypro says that its Extra! Extra! is 
the market’s first complete turnkey 
desktop publishing system. 

The menu-driven product includes 
the Kaypro 286i Model C with 640K 
bytes of RAM and 30M-byte hard 
drive, EGA graphics, EGA mono¬ 
chrome monitor, the 300 DPI Kaypro 
Page Printer II, Hewlett-Packard “B” 
fonts, an assortment of downloadable 
fonts, and all interface cabling. 

It also has a three-button mouse with 


Supermini is based on 
incremental architecture 

NCR has unveiled the 32-bit NCR 
Tower 32/800, a multiuser super¬ 
minicomputer based on NCR’s imple¬ 
mentation of the MC68020 microproces¬ 
sor and an NCR version of Unix. 

The application processor executes 
user applications and non-I/O func¬ 
tions within the system. As many as 
four can be installed, each with 4M 
bytes, 8M bytes, or 16M bytes of system 
memory, to provide up to 64M bytes of 
total system memory. 

The file processor, which has 1M byte 
of memory, is designed to handle 
system-file I/O. Up to four file proces¬ 
sors can be installed. Integrated disk 
capacity of the Tower 32/800 can range 
from 170M bytes to 850M bytes. 

The terminal processor offloads the 
Unix TTY subsystem and TTY driver 
from the application processor. Each 
terminal processor has 1M byte of 
memory and connects up to eight termi¬ 
nals/printers. The system can be con¬ 
figured to provide up to 128 
connections. 

The communication processor has 
1M byte of memory; it services commu¬ 
nications between the system and wide 
area networks. Up to four communica¬ 
tions processors, supporting two lines 
each, can be used. 

The local area network processor, 
with 1.25M bytes of memory, is 
designed to manage an Ethernet chan¬ 
nel and run TowerNet system software. 

All the system processors are Multi¬ 
bus II modules that connect directly to 
the Multibus II system bus. 

The suggested list price for a configu¬ 
ration that includes two application 
processors with 4M bytes of memory 
each, 340M bytes of integrated fixed- 
disk capacity, and the operating system 
is $127,000. 

R.S. No. 54 


paint software, Ventura Desktop Pub¬ 
lishing Software, WordStar 4.0, form 
generation software, MS-DOS 3.2, GW 
Basic, MailMerge, and CorrectStar. 

Extra! Extra! can combine text 
directly from various word processors 
and graphics from Lotus 1-2-3, 
AutoCAD, GEM, scanned photo¬ 
graphs, all ASCII files, and self-generated 
art. 

Extra! Extra! is priced at $8495. 

R.S. No. 53 


Optical disk system can store 
a trillion bytes of memory 

Eastman Kodak’s optical disk System 
6800 can store up to one terabyte of 
information. 

The system is a write-once/read- 
many-times type that comprises media, 
drive, controller, and interface. All 
components can be housed in the Sys¬ 
tem 6800 automated disk library, which 
can accommodate up to 150, 14-inch 
disks. 

Each disk provides 6.8G bytes of ran¬ 
domly accessible on-line storage and 
carries 14,000 tracks per inch. 

The drive records digital information 
at 21,000 bpi in fixed sectors of 1024 
bytes. Transfer rate is 1M byte per sec¬ 
ond, and average access time within a 
band is 100 milliseconds. 
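The one-terabyte headline figure can be checked against the per-disk numbers quoted above. The following is a quick sketch, assuming decimal units (1G byte = 1000M bytes) and a sustained transfer at the quoted rate; the announcement specifies neither, and the function names are illustrative only.

```python
# Figures quoted in the announcement.
DISKS_IN_LIBRARY = 150
GBYTES_PER_DISK = 6.8
MBYTES_PER_SECOND = 1.0


def library_capacity_tbytes(disks=DISKS_IN_LIBRARY, gbytes_per_disk=GBYTES_PER_DISK):
    """Total library capacity in terabytes (assumes 1T byte = 1000G bytes)."""
    return disks * gbytes_per_disk / 1000


def hours_to_read_disk(gbytes_per_disk=GBYTES_PER_DISK, rate=MBYTES_PER_SECOND):
    """Hours needed to stream one full disk at the quoted transfer rate
    (assumes the rate is sustained and 1G byte = 1000M bytes)."""
    seconds = gbytes_per_disk * 1000 / rate
    return seconds / 3600


print(f"Library capacity: {library_capacity_tbytes():.2f}T bytes")
print(f"Time to read one disk: {hours_to_read_disk():.1f} hours")
```

Under these assumptions a full library comes to slightly more than one terabyte, which is consistent with the figure in the announcement.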

The system, which is now in beta test 
phase, will be marketed through OEMs. 
Prices are scheduled to be announced in 
June. 

R.S. No. 55 


Douglas CAM Exchange, a program 
offered by California-based board man¬ 
ufacturer Douglas Electronics, allows 
customers across the US to transfer 
design files over a modem to the com¬ 
pany’s facility so that the files can be 
manufactured into circuit boards. 

A Hayes-compatible, 1200-baud 
modem is required, and design files 
must be created on the Douglas CAD/ 
CAM System. Douglas CAD/CAM is a 
software package designed by Douglas 
Electronics to integrate board layout 
and manufacturing into one system. 

Douglas CAM Exchange allows the 
customer to obtain board specifica¬ 
tions, receive a price quote, place an 
order, and transmit design files. Boards 


Compaq offers 12-MHz version 
of Deskpro 286 

Compaq Computer Corp. has 
replaced its 8-MHz Deskpro 286 with a 
model that it says is up to 50 percent 
faster than other 80286-based PCs. 

The new Deskpro 286 is available in 
three models. 

All are capable of running at either 8 
MHz or 12 MHz and offer five 8/16-bit 
and two 8-bit, full-height, industry- 
standard expansion slots; parallel 
printer interface; asynchronous commu¬ 
nications interface; and 5.25-inch, 
1.2M-byte diskette drive. All of them 
can be fitted with an optional 8-MHz 
80287 coprocessor and a 360K-byte dis¬ 
kette drive. 

Model 1 ($2999) offers 256K bytes of 
RAM. 

Model 20 ($3999) provides 640K 
bytes of RAM and a shock-mounted 
20M-byte fixed-disk drive. 

Model 40 ($4999) includes 640K bytes 
of RAM and a shock-mounted 40M- 
byte fixed-disk drive. 

12-MHz Deskpro: R.S. No. 56


80386-based PC supports
up to 4G bytes of RAM 

PC Discount says that its Noble 386, 
which has a Norton SI rating 18 times 
that of an IBM PC, is the fastest per¬ 
sonal computer on the market. 

The Noble 386 has an Intel 80386 
microprocessor, a 40M-byte hard disk 
drive, a 1.2M-byte floppy disk drive, 
and a 101-key enhanced keyboard. 

It also features an 80387 math 
coprocessor slot plus eight expansion 
slots and 1M byte of RAM that can be 
expanded to 4G bytes. 

The complete unit is priced at $3999. 

R.S. No. 57 


and/or artwork are generated by 
Douglas Electronics directly from the 
files sent over the modem. 

The Douglas CAD/CAM System 
runs on the Macintosh. Simple and 
complex boards up to 32 inches x 32 
inches can be created. The system has 
multilayer capabilities, including silk- 
screen and solder mask, and supports 
surface mount devices. There are 50 
selectable view sizes. All grids, pads, 
holes, and traces are completely user 
definable. 

The Basic system sells for $95. 


CAM Exchange: R.S. No. 58 
CAD/CAM System: R.S. No. 59 


System allows Macintosh users to design and create PCBs

Editor: Sallie Sheppard, Dept. of Computer Science, Texas A&M University, College Station, TX 77843; (409) 845-5466

Technical Activities Board plans for technology of the 1990s 

Ken Anderson, Second Vice President and TAB Chair 


The Technical Activities Board is
characterized by many as the “backbone”
of the Computer Society. TAB is
the resource that provides leadership
and support for technical and standards
activities. The focus is on maintaining
the technical vitality of society members
in order to enable them to advance the
theory, design, and application of
computer engineering, science, and
technology. TAB meets this goal by
providing opportunities for society
members to participate in various
technical activities.

Organizationally the TAB is made up
of the chairpersons of the 33 technical
committees, the chairperson of the TAB
Operations Committee, the TAB
vice-chairpersons, and the chairpersons of
task forces and special committees. The
technical committees have been
categorized into three groups (hardware,
software, and applications), each
coordinated by a TAB vice-chairperson.

Each TC concentrates on a particular 
specialty area, thus fostering a banding 
together of society members with com¬ 
mon technical interests. The TC sup¬ 
ports its members by providing 
newsletters, conference technical ses¬ 
sions, workshops on new technology, 
and other activities. The emphasis is on 
action and member involvement. 

The theme for TAB this year is “TAB 
— Planning for the Technology of the 
Nineteen Nineties, Together We Can Do 
It But To Fail To Plan Is A Plan To 
Fail.” In keeping with this theme, 


several new subcommittees and task 
forces have been established within TAB 
— all of which are open for more par¬ 
ticipation from society members. These 
new committees include 

(1) Six task forces to stimulate inter¬ 
disciplinary activity and communication 
among technical committees in meeting 
the challenges of new technology areas: 

• Evaluation and Selection Criteria 
for Computer Hardware and 
Software 

• Supercomputing Hardware, Soft¬ 
ware and Applications to Geophysi¬ 
cal Problem Solving 

• Satellite TV and Televideo Technol¬ 
ogy for a Delivery System for Tech¬ 
nical, Administrative and Career 
Development Meetings 

• Computers in Manufacturing 

• Decision Support Systems 

• Computers and Artificial Intel¬ 
ligence. 

(2) A TAB Marketing and Planning 
Committee, which is currently carrying 
out a self-assessment of TAB to be used 
as a basis for future planning. 

(3) An International Relations Com¬ 
mittee to facilitate our activities in 
Europe, the Far East, South America, 
and other areas outside the continental 
US. 

(4) A TAB Board of Governors Advi¬ 
sory Committee to foster better commu¬ 
nication between Computer Society 
decision makers and TC leadership. 

(5) A TAB Standards Activity Board 
Liaison Committee to study our proce¬ 


dures and guidelines for standards 
development and to help in formulating 
a consistent standards development pro¬ 
cedure. 

Each of these committees and task
forces has been established to provide
avenues for interested Society members
to influence evolving technology.

To provide you with technical infor¬ 
mation and news originating from TAB, 
a vice-chairperson for publications and 
communications has been established. 

In this role, Sallie Sheppard has 
assumed leadership in establishing a 
relationship with the Computer Society 
Publications Board and the staff of 
Computer. With the cooperation of 
Computer Editor-in-Chief Bruce Shriver, 
she will coordinate items in this depart¬ 
ment each month to help you stay 
abreast of technical activities and 
opportunities for involvement. 

We invite you to join us so that 
together we can plan for the technology 
of the 1990s. 


Multiple-Valued Logic 
Technical Committee 

Jon Butler, Naval Postgraduate School 

The question of whether analog or 
digital circuits are more appropriate for 
computing has been settled for some 
time now. However, because of prob¬ 
lems inherent in VLSI, this question is 
being examined again. While most semi¬ 
conductor devices have two stable states, 
making binary the best choice with 
respect to device complexity, the inter¬ 
connect is not effectively used with 
binary signals. For example, a two¬ 
valued line carries only half the infor¬ 
mation of a four-valued line. In VLSI, 
70 percent of the chip area is devoted to 
interconnect, 20 percent to insulation, 
and only 10 percent to devices. Thus, 
there is considerable interest in technol¬ 
ogies which support multiple values. 
Technologies in which there is consider¬ 
able research in multi-valued logic 
include MOS and CCD. The advent of 


The goal of the new Computer Society News Department is to keep you 
informed of the society's varied activities. It will include reports from techni¬ 
cal committees, observations of major trends and advances, announcement 
of opportunities for involvement with committees and task forces, and news 
of technical studies and public policy issues. A major focus will be the activi¬ 
ties of the technical committees, which report through the Technical Activi¬ 
ties Board. This month, TAB Chair Ken Anderson gives an overview of the 
group, and one of the TCs is highlighted. 

It is my hope that this column will provide a clearinghouse of information 
and opportunities for involvement, thereby allowing you to take advantage of 
the efforts of others and to hear of opportunities for greater participation on 
your part. 

—Sallie Sheppard

optical computing will almost certainly
involve the use of more than two values 
of logic, and this is presently receiving 
considerable attention. 
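The remark that a two-valued line carries only half the information of a four-valued line follows from counting bits per symbol: a line with r distinguishable levels carries log2(r) bits. The sketch below assumes ideal, noise-free levels; the function names are illustrative and do not come from the column.

```python
import math


def bits_per_symbol(radix: int) -> float:
    """Information carried by one symbol on a line with `radix` levels."""
    if radix < 2:
        raise ValueError("a signal line needs at least two levels")
    return math.log2(radix)


def lines_needed(total_bits: int, radix: int) -> int:
    """Parallel lines required to move `total_bits` in a single transfer."""
    return math.ceil(total_bits / bits_per_symbol(radix))


for radix in (2, 3, 4):
    print(radix, bits_per_symbol(radix), lines_needed(32, radix))
```

For a 32-bit transfer this gives 32 binary lines but 16 four-valued lines, which is why a higher radix is attractive when interconnect dominates chip area.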

Commercially available ICs now
incorporate memories that operate on
four levels of logic. These include a
numeric coprocessor and certain
semicustom designs. Prototype PLAs,
ROMs, arithmetic units, and signal
processors have been produced that
operate on more than two levels of logic.
Image processing in which various gray 
levels are encoded directly into logic 
levels is being investigated, and circuits 
that take advantage of “encoderless” 
processing have been built. Because 
there is a significant increase in the 
complexity of logic operations as radix 
increases, there are important logic 
design problems, including, for example, 
optimal PLA operations and knowledge- 
based circuit design. The use of multi¬ 
valued logic impinges on system design 
issues as well, including arithmetic oper¬ 
ations, microprogramming, and system 
reliability. Multi-valued logic concepts 
have been helpful in the design and 
analysis of binary systems, for example, 
in binary PLA minimization techniques. 

Of particular interest recently are 
spectral transform techniques in which 
logic functions are given alternate 
representations so that certain aspects 
of analysis and design can be more read¬ 
ily accomplished. Spectral methods have 
been useful in the design of circuits, dig¬ 
ital testing, and pattern recognition. The 
activity in this area has been reported in 
two special workshops and two Academic 
Press texts. 

An important area of research is 
multi-valued algebras. There are, for 
example, important questions regarding 
the completeness of operations. That is, 
a complete set of operations is capable 
of realizing all possible functions. Com¬ 
pleteness questions have been answered 
satisfactorily in binary (1941), ternary 
(1954), and all other radices (1965). This 
has practical implications, since it 
affects which hardware realizations 
should be implemented. One recent
result is a completeness criterion for
partially specified operations, which
involves more conditions than the totally
specified case: for three-valued logic,
for example, 18 conditions must be
checked to determine completeness for
totally specified functions and 58 for
partially specified functions.
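The idea of completeness can be illustrated by brute force for small cases. The sketch below is not the condition-based criterion mentioned above; it is an exhaustive check, written for illustration, that composes two-input operations until no new two-variable truth tables appear. It is practical for the binary case only, since the number of truth tables grows very quickly with the radix.

```python
from itertools import product


def is_complete(ops, radix=2):
    """Return True if the two-input operations in `ops` can realize every
    two-variable function over `radix` values by composition."""
    inputs = list(product(range(radix), repeat=2))
    # A function is stored as its truth table over all (x, y) input pairs.
    known = {
        tuple(x for x, _ in inputs),  # projection f(x, y) = x
        tuple(y for _, y in inputs),  # projection f(x, y) = y
    }
    target = radix ** len(inputs)  # number of distinct truth tables
    frontier = set(known)
    while frontier:
        new = set()
        # Combine every known function with every newly found one, both ways.
        for f in known:
            for g in frontier:
                for op in ops:
                    for table in (
                        tuple(op(a, b) for a, b in zip(f, g)),
                        tuple(op(a, b) for a, b in zip(g, f)),
                    ):
                        if table not in known:
                            new.add(table)
        known |= new
        frontier = new
    return len(known) == target


def nand(a, b):
    return 1 - (a & b)


print(is_complete([nand]))                                   # NAND alone
print(is_complete([lambda a, b: a & b, lambda a, b: a | b]))  # AND with OR
```

NAND alone generates all 16 binary two-variable functions, while AND together with OR does not, because neither can produce inversion.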

A means to interface uncertain real 
world parameters with precise input 
specifications of digital computers is 
fuzzy logic. Unlike conventional 
multiple-valued variables, fuzzy varia¬ 
bles take on a continuum of values. 
Within expert systems, fuzzy logic is the
basis for the management of uncertainty.
For example, fuzzy logic is an
integral part of a working medical- 
diagnosis expert system. Fuzzy logic 
integrated circuits have been developed 
that achieve 80,000 fuzzy-logic infer¬ 
ences per second, which is about 10,000 
times faster than conventional fuzzy- 
logic systems. 
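To make the contrast with discrete multiple-valued variables concrete, the sketch below uses one common textbook formulation of fuzzy operators (minimum for AND, maximum for OR, one minus the value for NOT) over membership degrees between 0 and 1. This choice of operators and the temperature thresholds are assumptions made for illustration; the column does not say which formulation the systems it mentions use.

```python
def fuzzy_and(a: float, b: float) -> float:
    """Fuzzy conjunction as the minimum of two membership degrees."""
    return min(a, b)


def fuzzy_or(a: float, b: float) -> float:
    """Fuzzy disjunction as the maximum of two membership degrees."""
    return max(a, b)


def fuzzy_not(a: float) -> float:
    """Fuzzy complement."""
    return 1.0 - a


def hot(temperature_c: float) -> float:
    """Hypothetical membership function for "hot": 0 at or below 20 C,
    1 at or above 40 C, and a linear ramp in between."""
    return min(1.0, max(0.0, (temperature_c - 20.0) / 20.0))


reading = 30.0
print(hot(reading))                              # degree to which 30 C is hot
print(fuzzy_and(hot(reading), fuzzy_not(0.25)))  # "hot AND NOT humid"
```

A reading of 30 C is neither hot nor not hot; it is hot to degree 0.5, and rules combine such degrees rather than discrete logic levels.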

Important strides have been made in 
the application of fuzzy logic to control. 
Fuzzy logic has been incorporated in 
cement kiln regulators, subway train 
controllers, train loading controllers, 
and automobile engine controllers. 
Originally a theoretical subject, fuzzy 
logic has matured to a field where theo¬ 
retical as well as practical advances have 
contributed to computer science and 
engineering. 

The Multiple-Valued Logic Technical 
Committee, formally established in 
1980, has the charter to advance the 
state of knowledge in this important 
area. Its members reflect an interna¬ 
tional interest. Besides many US mem¬ 
bers, there are researchers in Canada, 
Japan, China, England, Holland, 
France, and Germany. A significant 
number of the advances, both practical 
and theoretical, are being made outside 
the US, and the MVL-TC has made it 
possible for these researchers to exchange 
ideas and engage in cooperative 
research. Collaborative research efforts 
have been conducted between England- 


Canada, US-Holland, France-Canada, 
US-Canada, Germany-Japan, Canada- 
China, Canada-Japan, and others. 

The MVL-TC, one of the most active 
technical committees in the Technical 
Activities Board, holds an annual symposium,
publishes a newsletter, and
sponsors other activities such as workshops
and special issues of journals.

The 17th International Symposium 
on Multiple-Valued Logic will be held 
May 26-28 on the Boston Campus of 
the University of Massachusetts. Papers 
will cover logic design, device imple¬ 
mentation, applications to systems relia¬ 
bility, and formal logic. Invited speakers 
include R. McKenzie and L. Zadeh of 
the University of California, Berkeley, 
and H. Rasiowa of Warsaw University. 
For further information, contact Dan 
Simovici, Department of Mathematics 
and Computer Science, University of 
Massachusetts, Boston, MA 02025;
(617) 929-7966.

For further information on the MVL- 
TC, contact the chairperson, Michel 
Israel, at the Department of Electrical 
Engineering, University of Toronto, 
Toronto, Ontario, Canada, M5S 1A4; 
home phone, (416) 482-9942; office
phone, (416) 978-2402. After July, contact
Israel at IIE-CNAM, 18 Allee Jean 
Rostand, BP 77 91002 Evry Cedex, 
France; office phone, 1 (6) 077-9740. 


Technical committee chairs 


Computational Medicine: John Long, University of Minnesota, 2829 University Ave., Suite 408, Minneapolis, MN 55414; (612) 627-4850

Computer Architecture: George Michael, Lawrence Livermore Laboratory, PO Box 808, L-306, Livermore, CA 94550; (415) 422-4239

Computer Communications: William Livingston, Vance Systems, 901-U Bonanza Blvd., Chantilly, VA 22021; (703) 481-0990

Computer and Display Ergonomics: James Greeson, IBM Corporation, H29/Bldg. 061, Research Triangle Pk., NC 27709; (919) 543-6655

Computer Elements: Robert Sullivan, Honeywell Information Systems, MS 176, PO Box 8000, Phoenix, AZ 85066; (602) 862-5846

Computer Graphics: Lawrence Rosenblum, Naval Research Laboratory, Code 5170, 4555 Overlook Ave. SW, Washington, DC 20375; (202) 767-3743, 2384

Computer Languages: Pei Hsia, 2100 Oak Bluff Dr., Arlington, TX 76001; (817) 273-3785

Computer Packaging: Eugene Shapiro, IBM T.J. Watson Research Ctr., 13-209, PO Box 218, Yorktown Heights, NY 10598; (914) 945-1711

Computers in Education: Alfred Bork, University of California, Educational Tech. Ctr., Irvine, CA 92715; (714) 856-6943/7043

Computing and the Handicapped: Elmer Hoyer, Wichita State University, EE Dept., Box 44, Wichita, KS 67208; (316) 689-3415

Design Automation: Sumit Dasgupta, 5 Drew Court, Wappingers Falls, NY 12590; (914) 432-5046

Database Engineering: Sushil Jajodia, Naval Research Laboratory, Code 55-94, Washington, DC 20375-5000; (202) 767-3596

Distributed Processing: Carl Davis, University of Alabama, Huntsville, AL 35899; (205) 895-6088/6653


Committee on Public Policy:

Briefing on Section 1706 of the 1986 US Tax Law 


Ralph J. Preiss, COPP Vice-Chair 

When the 1986 Tax Law was being 
revised by the joint House/Senate
committee, presumably with the advice and
consent of the bureaucrats of the Inter¬ 
nal Revenue Service, a section was left in 
the law that singles out programmers, 
system analysts, engineers, and scientists 
who do consulting work. They must 
now prove their independent contractor 
status to the satisfaction of the IRS for 
tax purposes. This is Section 1706, 
added as an amendment by Sen. Daniel 
Moynihan (D-NY), and accepted with¬ 
out hearings or debate by voice-vote on 
June 20, 1986, to “provide greater cer¬ 
tainty and simplification in employment 
tax law and result in greater com¬ 
pliance.” 

The section reads rather innocuously 
as follows: 

“TREATMENT OF CERTAIN TECHNICAL
PERSONNEL. (a) In General-
Section 530 of the Revenue Act of 1978 is 
amended by adding at the end thereof the 
following new subsection: ‘(d) Exception. 
— This section shall not apply in the case 
of an individual who, pursuant to an 
arrangement between the taxpayer and 


another person, provides services to such 
other person as an engineer, designer, 
drafter, computer programmer, systems 
analyst, or other similarly skilled worker 
engaged in a similar line of work.’ (b) 
Effective Date. — The amendment made 
by this section shall apply to remuneration 
paid and services rendered after December 
31, 1986.” 

Section 530 of the Revenue Act of
1978 turns out to be a “safe harbor”
provision that explicitly allows certain
taxpayers who in the past had a
reasonable basis (such as past
industry practice) for not treating
workers as employees to continue doing so.
The groups that were protected as
independent contractors included
fishermen, lumberjacks, barbers, and
of course those people in the Computer
Society who do consulting work.

Software Consultant Brokers Associ¬ 
ation, representing firms that broker 
independent high-tech engineering and 
computer consultants, reports that the 
National Technical Services Associa¬ 
tion, a trade group representing con¬ 
sulting organizations who employ their 
technical staff, successfully lobbied for 
Section 1706 as a revenue-producing 
provision in order “to stamp out 


independent contractors through legis¬ 
lative means what they cannot achieve 
in the marketplace... a competitive edge 
over the independent contractor mar¬ 
ket.” The SCBA statement goes further 
to say, “Responsible trade organiza¬ 
tions had no advance knowledge of any 
hearings held for the amendment; no 
known opportunities existed for feed¬ 
back from technical spokespeople of 
this country...” before the law was 
enacted. COPP was contacted by
David Hicks, SCBA president, inviting 
the Computer Society to take a stand 
against “this discriminatory practice.” 

COPP investigation discovered the 
following: 

(1) According to the literature sup¬ 
plied by the SCBA, Section 1706 had no 
hearings before it was slipped into the 
law. 

(2) For years the IRS has been trying
to tighten its grip at the source on
taxpayers reporting income. In a study
based on 1976 revenues, the IRS claims
that independent contractors formed a
segment of 39 to 44 percent of a
population that is responsible for
under-reporting $75 billion to $100 billion
in income. This was estimated as resulting
in a tax loss of $12.8 billion to $17.1
billion. By singling out the professions
represented by the membership of the
Computer Society, the IRS hopes to put
a dent in the under-reporting and
generate an additional $60 million over
the next five years. (It is not clear
whether this should be classified as
“closing a loophole” or whether it is
brazenly discriminatory.)

(3) In 1978 Congress decided to study 
the problem of defining who is an 
employee. In the meantime it enacted a 
“safe harbor” provision which stated 
that the IRS could not arbitrarily define 
people as employees if they have in the 
past been considered as independent 
contractors. The problem is that since 
that time, Congress never actually got 
around to studying the employee defini¬ 
tion problem, and thus the “safe har¬ 
bor” provision remained in effect. 
Section 1706 deliberately removed the 
“safe harbor” provision from our 
profession without providing further 
definitions or guidelines. 

(4) Without further definition or 
clarification, it is possible that a client 
who retains a consultant believed to be
an “independent contractor”


can incur a tax liability plus a fine 
should the IRS later rule that the person 
is an “employee.” The liability might 
include the employer FICA contribu¬ 
tions, federal and state tax withhold¬ 
ings, federal unemployment taxes and 
workers compensation insurance 
premiums, and maybe even employee 
benefits, such as vacation pay and med¬ 
ical coverage. 

(5) The IEEE United States Activities 
Board has already retained counsel to 
provide interpretation of Section 1706, 
applicability, and the possibility of 
changing the law. It also has created an 
entity position statement printed in the 
March issue of The Institute. 

(6) On January 29, Rep. Judd Gregg 
(R-NH) introduced HR.792 to postpone 
the effective date of the section from 
January 1, 1987 to January 1, 1989, 
thereby providing two more years for 
Congress to decide its intent. The identical
Senate legislation (S.429) was introduced
by Senators Dave Durenberger
(R-MN) and Arlen Specter (R-PA). On
February 5 Senators Al D’Amato (R-NY)
and Christopher Dodd (D-CT)


introduced S.491 to repeal Section 1706 
altogether. 

(7) President Reagan has said: 
“America’s competitiveness in world 
markets is critical to maintaining and 
expanding our standard of living and 
the national security. I have established 
a national goal of assuring American 
preeminence into the 21st century.
Achieving that goal is the responsibility
of all Americans.” Maybe if he knew of
the anti-competitive ruling that Section
1706 represents, he might issue an executive
order to the IRS clarifying the
independent contractor vs. employee 
status until Congress makes up its mind. 

I recommend that concerned Com¬ 
puter Society members express to their 
elected representatives their feelings 
about being singled out as tax dodgers. 
Our tax system is based on voluntary 
compliance, and our judicial system is 
based on the accused being considered 
innocent until proven guilty. Our Con¬ 
gress forgot these pillars of the Ameri¬ 
can system in its haste to get the tax 
reform act on the books. Let us remind 
them with our letters and mailgrams.

UPDATE


While DP salaries rise, AI turmoil forecast 


America’s data processors can look 
forward to receiving record starting sala¬ 
ries in 1987, according to a study con¬ 
ducted by Robert Half International of 
New York City. This annual study, 
based on comprehensive analysis of 
thousands of position requests submit¬ 
ted by employers throughout the US, 
serves many corporations as a salary 
review guide. 

The study found starting salaries up 
an average of five percent over 1986, 
and covered the major data-processing 
categories—including programmers, sys¬ 
tems analysts, computer operators, data¬ 
base administrators, EDP auditors, and 
management information system direc¬ 
tors. Here are some highlights: 

Project managers at large installa¬ 
tions should find starting salaries in the 
$38-48k range—a 6.8-percent increase 
over 1986. 

Programmers working at medium- 
size installations should find starting 
salaries in the $22-29k range—up 8.5 
percent over last year. 

Systems analysts at small installations 
can expect starting salaries in the 
$26.5-34k range—a 2.5-percent gain. 

Computer operators at large installations
will start in the $19-24.5k range,
up 8.8 percent.

MIS Directors at medium-size instal¬ 
lations can anticipate starting salaries 
ranging from $45k to $58k—up 3 per¬ 
cent over 1986. 

These salaries are national averages, 
and geographic variances should be 
applied to all data-processing starting 
salaries under $50k. For example, start¬ 
ing salaries in Connecticut, California, 
and Massachusetts are 5-percent higher 
than the national average—but salaries 
are 9- to 10-percent lower in Arizona, 
Florida, and Maine. For reasons not 
hard to comprehend, starting salaries 
average 20-percent lower in Hawaii— 
and 18-percent higher in Alaska. For 
positions located in cities with over one 
million in population, 5 percent should 
be added to the prevailing starting salary. 
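The geographic adjustments described above amount to simple percentage arithmetic. The sketch below is one reading of them and rests on assumptions the summary does not settle: that the state variance and the large-city premium are added together, and that the $50k limit applies only to the state variance. Arizona, Florida, and Maine are omitted because the text gives only a 9- to 10-percent range for them. All names are illustrative.

```python
# State variances quoted in the study summary, as percent of national average.
STATE_ADJUSTMENT_PCT = {
    "Connecticut": 5,
    "California": 5,
    "Massachusetts": 5,
    "Hawaii": -20,
    "Alaska": 18,
}
LARGE_CITY_PREMIUM_PCT = 5   # cities with over one million in population
VARIANCE_CEILING = 50_000    # state variances apply only below this salary


def adjusted_starting_salary(national_average, state=None, large_city=False):
    """Apply the quoted geographic adjustments to a national-average salary.

    Assumes the adjustments add, and that the ceiling limits only the
    state variance; both are interpretations, not statements from the study.
    """
    pct = 0
    if national_average < VARIANCE_CEILING:
        pct += STATE_ADJUSTMENT_PCT.get(state, 0)
    if large_city:
        pct += LARGE_CITY_PREMIUM_PCT
    return national_average * (100 + pct) / 100


print(adjusted_starting_salary(25_000, "Alaska"))
print(adjusted_starting_salary(30_000, "California", large_city=True))
```

Under these assumptions, a position with a $25,000 national average would start at $29,500 in Alaska.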

Unfortunately, the fast-growing AI job 
market will undergo turmoil during the 
year ahead. According to Herb Halbrecht, 
president of Halbrecht Associates (Stam¬ 
ford, Connecticut), an unprecedented 
number of senior and middle-level people 
will change jobs. “Increased merger 


activity, the growing maturity of the indus¬ 
try, and the number of new people who 
have been recently trained in AI are all fac¬ 
tors affecting the AI job market. 

“As a result of recent or potential 
mergers, small AI/expert system groups at
certain corporations may be combined or 
eliminated. And that means that some peo¬ 
ple will be pushed into the job market,” 
Halbrecht predicts. “At the same time, a 
number of AI firms themselves are likely 
to merge or seek to be acquired, while some 
major corporations will play catch-up in 
AI or seek to hire away small groups of AI 
people or even entire expert systems com¬ 
panies. Such factors alone would portend 
an active year in any job market.” 

But 1987 may also bring a long-expected 
shakeout in the expert systems industry. 
“The honeymoon is over in AI and expert 
systems,” Halbrecht says. “Many ventures 
are beginning to feel pressure from inves¬ 
tors to produce some profits, suggesting 
that staff cutbacks may not be far off.” 

Specialist in AI recruitment Daryl 
Furno, Halbrecht’s vice president, fore¬ 
casts more immediate problems facing the 
pioneering AI start-ups of the early 1980s. 
“Many had four-year vesting programs 
and now they’ll soon be four years old. We 
already sense that many good people, 
including founders, will depart for new 
ventures or other companies. Some will 
look around in the job market and others 
will start up completely new ventures,” she
said. “And after four years of total dedi¬ 
cation, still others will seek out the pre¬ 
sumed stability and normality of the 
nine-to-five job.” 

Nevertheless, Furno still sees the overall 
AI job market as strong and even expand¬ 
ing: “The big challenge in AI recruitment 
is that there just are not enough qualified 
people nationwide, and new specialties 
seem to emerge every year. In 1984-85, it 
was expert systems. And last year, it was the 
knowledge engineers who help develop 
expert system applications. In 1987, we 
foresee growing interest in those who can 
apply expert systems to computer- 
integrated manufacturing.” 

In addition to expert system applica¬ 
tions, Furno thinks the next AI specialty 
will be “neural networking.” 

For a free copy of the complete research 
survey, contact AI Consultants, 1294 Mar¬ 
ble Hill, Lake Zurich, IL 60047; or call Joe 
Megna at (312) 438-8581. 


Soviet “Star Wars”? 

Kathleen O’Toole, Stanford University 

Did the Soviets begin their own Star- 
Wars-related research in the 1960s? And 
now, in response to US development, are 
they gearing up for a more costly scien¬ 
tific effort? The recent Soviet announce¬ 
ment canceling development of a large 
irrigation project, and the appointment 
of a computer scientist to head the 
Soviet Academy of Sciences might be 
clues that Mikhail Gorbachev is reor¬ 
ganizing Russian scientific and eco¬ 
nomic goals to meet the Star Wars 
challenge, according to Mark Kuchment— 
a Russian emigre and visiting scholar at 
Harvard’s Russian Research Center. 

Gorbachev’s decision to end Andrei 
Sakharov’s exile may also have been cal¬ 
culated, Kuchment said, to “bolster the 
morale of scientists” as part of an effort 
“to mobilize scientists on a broad 
front.” Russian scientists had viewed 
Sakharov’s exile as “an insult to the 
whole profession.” 

Obtaining technology from the West 
is more difficult than during the 60s and 
70s, Kuchment continued, so the Soviets 
must mount a massive scientific effort if 
they want to compete with America’s 
strategic defense initiative. Computer 
science is one of their weak points, and 
appointing a computer scientist to head 
the Soviet academy could indicate new 
emphasis on that type of research. 

While the Soviets blame America for 
this new defense phase of the nuclear 
arms race, Kuchment said he recently 
obtained new information suggesting 
that “in a way, the Soviets helped to put 
this train of thought and development in 
motion. In a way, they have themselves 
to blame.” 

Kuchment’s new information came 
from “several Soviet emigrants who par¬ 
ticipated in Russian Star-Wars-related 
research. What those people are telling 
us is that the Soviets embarked on the 
development of several Star-Wars-related 
technologies quite independently of the 
United States.” 

As an example, Kuchment cited an 
account provided by the Russian 
developer of a magnetron—a powerful 
generator of shortwave pulses used in 
land-based radar. That developer 
claimed that in 1959, prior to the signing
of SALT I limiting antiballistic missile
development, Russia’s Military
Industrial Commission decided to go
with the cheaper of two proposed
antiballistic missile systems. This decision
led to the ABM ring around
Moscow, the world’s only ABM
defense. The developer said the magnetron
he designed for that system was
finished and tested by 1965 when he was
“approached by a major Soviet 
developer of ABM systems to breed a 
new, much more powerful (10 times 
more powerful) magnetron amplifier. 
The implication was that this would 
become part of phased-array radar
systems. 

“Our speaker started to work on the 
project but later gave this assignment to 
his former student who inherited a 
laboratory of 450 people and worked on 
this project for more than 10 years,” 
Kuchment went on. When asked if 
SALT negotiations slowed his research, 
the Russian scientist laughed: “No way. 
It [the research] was only encouraged.” 

“When the developer left Russia in 
the 70s, the project was halfway com¬ 
pleted,” Kuchment said. “He believes 
that was the beginning of the develop¬ 
ment of a group of radar installations, 
one of which now appears in Kras¬ 
noyarsk and is the cause of great con¬ 
cern in this country. There are big 
discussions whether [the Krasnoyarsk 
installation] is part of the Soviet 
national ABM system, whether it’s a violation
of the SALT treaty, or whether
it’s a chance encounter. What we got 
from this speaker is [that] it doesn’t 
seem to be a chance encounter. It seems 
to be a . . . very deliberate technological 
development that wasn’t influenced at 
all by SALT negotiations.” 

The Krasnoyarsk radar technology 
does not violate the SALT treaty but its 
location does, according to Greg Dalton 
of Global Outlook Inc.—a Palo Alto 
consulting firm involved with Stanford 
University in an 18-month study of 
SALT treaty compliance. In signing the 
ABM treaty, both countries recognized 
the need for ballistic missile early- 
warning radar. Since they can be used to 
detect and track warheads, both coun¬ 
tries agreed that radar installations must 
be located on a nation’s periphery— 
which Krasnoyarsk is not. “But it’s hard 
to establish the Soviets’ intention by
looking at the development of an iso¬ 
lated technology. You have to look at 
combinations of technologies and how 
they are deployed relative to each 
other,” Dalton concluded. 

SDI worries the Soviets largely 
because of its trillion-dollar cost and 
scale, Kuchment claimed. Star Wars is 
also “unsettling” to the Soviets because 
“it touches technologies where the
Soviets aren’t very strong . . . computer 
technology and certain areas of quan¬ 
tum electronics.” Despite these reserva¬ 
tions, Kuchment said “there are 
indications they are trying to meet the 
challenge.” 

One such indication was the recent 
decision canceling a five-year project to 
reverse river flows that would increase
central Asia’s water supply. “There was 
no announcement, of course, on where 
that money will go,” according to Kuchment.
“They canceled it under pressure
from such terribly influential people as 
Soviet writers, environmentalists, and 
some economists. The [Soviet] govern¬ 
ment doesn’t ordinarily give in so easily 
to the public and the press.” 

Gorbachev also met recently with the 
Soviet Academy of Sciences, Kuchment 
concluded: “No real account was pub¬ 
lished [but Gorbachev] hinted of sweep¬ 
ing reforms” to come. 


GE’s machine vision research 


General Electric researchers have 
undertaken a program designed to 
enhance the image-understanding capa¬ 
bility of computerized machine vision 
systems, enabling such systems to 
“recognize” tanks, guns, and other 
objects they “see.” This winter, DARPA 
funded a two-year, $1 million contract 
to support this work. 

The GE program relates to a larger 
DARPA effort to demonstrate an 
unmanned testbed vehicle called the 
“autonomous land vehicle” (ALV). The 
ALV will help in evaluating unmanned 
vehicles for military operations such as 
surveillance and reconnaissance mis¬ 
sions and shuttling supplies to the front 
lines. GE is one of several groups develop¬ 
ing advanced image-understanding tech¬ 
niques for use in the vehicle’s guidance 
system. This work is targeted at giving 
unmanned vehicles an ability to recog¬ 
nize objects encountered en route, help¬ 
ing the vehicles “decide” what to do in 
different situations; for example, when 
they “see” a hostile tank approaching, 
to turn and run. 

Founded on an advanced AI software 
concept known as geometric reasoning, 
GE’s approach involves teaching com¬ 
puters to recognize objects by matching 
geometric features with those of images 
stored in computer memory. For this to 
work, computers will have to remember 
hundreds of images. Developing methods 
enabling computers to memorize such 
objects quickly—simply by looking at 
photos, drawings, or scale models—is a 
major thrust of the program. 

Researchers are working on a tech¬ 
nique wherein computers are “shown” 
photos, drawings, or mockups from 
several viewpoints (front view, side view, 
and three-quarter view) using a solid 
state TV camera that converts the 
images into digital data. The computer 
takes this data and, employing special 
algorithms, manipulates it to create a 
“wire-frame” three-dimensional model, 


then files the model away in its memory. 
Models constructed in this way will be 
called up from memory during recogni¬ 
tion; that is, when ALV’s solid state 
camera eye spots an unidentified object 
in its path. 

“What the computer does in this exer¬ 
cise is search for a rotation and transla¬ 
tion of the computer model that matches 
the 2D image the camera is seeing,” 
explains principal investigator Joseph L. 
Mundy. “If three or four features agree 
reasonably well, the computer concludes 
that it has probably identified the 
object. It then goes on to more detailed 
tests to remove any doubt.”
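
The search Mundy describes can be illustrated with a small Python sketch. This is not GE's implementation: it is a brute-force toy under stated assumptions, namely that the model is a set of 3D feature points, the camera is orthographic, rotation is about the vertical axis only, and candidate translations come from aligning the first model feature with each image feature. All function names and the tolerance values are hypothetical.

```python
import math


def rotate_y(point, theta):
    """Rotate a 3D point about the vertical (y) axis by theta radians."""
    x, y, z = point
    c, s = math.cos(theta), math.sin(theta)
    return (c * x + s * z, y, -s * x + c * z)


def project(point):
    """Orthographic projection onto the image plane (assumption: no perspective)."""
    return (point[0], point[1])


def count_matches(projected, shift, image_pts, tol):
    """Count projected model features that land within tol of some image feature."""
    matched = 0
    for u, v in projected:
        u, v = u + shift[0], v + shift[1]
        if any(math.hypot(u - a, v - b) <= tol for a, b in image_pts):
            matched += 1
    return matched


def find_pose(model_pts, image_pts, tol=0.02, min_agree=3, angle_steps=72):
    """Search rotations and translations; return (matches, theta, shift) or None.

    A pose is accepted only if at least min_agree features agree, echoing the
    "three or four features" rule of thumb quoted above.
    """
    best = None
    for k in range(angle_steps):
        theta = 2 * math.pi * k / angle_steps
        projected = [project(rotate_y(p, theta)) for p in model_pts]
        # Hypothesize translations by aligning model feature 0 with each image feature.
        for a, b in image_pts:
            shift = (a - projected[0][0], b - projected[0][1])
            n = count_matches(projected, shift, image_pts, tol)
            if best is None or n > best[0]:
                best = (n, theta, shift)
    if best is not None and best[0] >= min_agree:
        return best
    return None


if __name__ == "__main__":
    model = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0.5), (0.3, 0.2, 1)]
    # Synthesize an "image": the model turned 30 degrees and shifted by (2, -1).
    image = [project(rotate_y(p, math.pi / 6)) for p in model]
    image = [(u + 2.0, v - 1.0) for u, v in image]
    print(find_pose(model, image))
```

A candidate pose that passes this coarse test would then go on to the "more detailed tests" the article mentions, which are not sketched here.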

Researchers have demonstrated the 
basics required for success in the current 
DARPA program. 



A laser stripe lights up prominent features 
on a jeep model, enabling computers to 
establish the jeep’s location in space. 
CAREER OPPORTUNITIES


RATES: $12 per line, $120 minimum charge (up 
to ten lines). Average six typeset words per line, 
nine lines per column inch. Add $10 for box num¬ 
ber. Send copy at least six weeks prior to month 
of publication to: Carole Porter, Classified 
Advertising, IEEE COMPUTER Magazine, 10662 
Los Vaqueros Circle, Los Alamitos, CA 90720. 


In order to conform to the Age Discrimination in 
Employment Act and to discourage age discrimination,
IEEE COMPUTER may reject any advertisement
containing any of these phrases or
similar ones: “...recent college grads...,”
“...1-4 years maximum experience...,”
“...up to 5 years experience...,” or “...10
years maximum experience.” IEEE COMPUTER
reserves the right to append to any advertise¬ 
ment, without specific notice to the advertiser, 
“Experience ranges are suggested minimum re¬ 
quirements, not maximums.” IEEE COMPUTER 
assumes that, since advertisers have been 
notified of this policy in advance, they agree 
that any experience requirements, whether 
stated as ranges or otherwise, will be construed 
by the reader as minimum requirements only. 


UNIVERSITY OF NEVADA 
Las Vegas 

The Computer Science and Electrical Engineer¬ 
ing Department invites applications for tenure 
track faculty positions in Computer Science. Ap- 
plicants should have a Ph.D. in Computer 
Science or a related area and a commitment to 
excellence in teaching and research. All fields of 
Computer Science research will be considered 
with preference given to: Architecture, Database 
Management, Operating Systems, and Software 
Engineering. Salary is competitive and commen¬ 
surate with experience. 

The Department offers B.S. and M.S. degrees in
Computer Science. Current faculty interests are 
in Theory of Computation, Artificial Intelligence, 
Graphics, Simulation and Modeling, and Soft¬ 
ware Engineering. The Department's Computer 
Science Research Laboratory is presently equip¬ 
ped with DEC minicomputers on an Ethernet. By 
summer it will also include an Intel Hypercube, 
several Sun workstations, a Symbolics work¬ 
station, and a high-performance graphics work¬ 
station. Construction has just begun on a new 
Engineering building funded by the State of 
Nevada. 

Send letters of application, vita, transcripts, and 
names and addresses of four references to: Dr. 
Y.S. Cooper, Computer Science Search Committee,
Dept. of Computer Science and Electrical
Engineering, University of Nevada, Las Vegas, 
4505 S. Maryland Parkway, Las Vegas, NV 89154. 
Review of applications will begin March 31, 1987
and will continue until positions are filled. The 
University of Nevada, Las Vegas is an Equal Op¬ 
portunity/Affirmative Action Employer. 


NORWICH UNIVERSITY 

Faculty position in Computer Science Engineer¬ 
ing. Tenure-track position, rank and salary com¬ 
mensurate with experience. Ph.D. desired, but 
will consider Masters with experience. Available 
1 July, 1987. Recently implemented Computer 
Science Engineering program. Facilities in¬ 
clude: VAX Cluster, MicroVax supermicrocom¬ 
puters and PC based workstations. New faculty 
will assist in laboratory and curriculum develop¬ 
ment. Expertise in any of the following areas 
desired: Digital Systems Design, CAD/CAE, 
Computer Architecture, Computer Communica¬ 
tions and Networking. A strong commitment to 
undergraduate education is essential. Norwich 
University is located in an area of Central Ver¬ 
mont that offers small town or rural living with 
good schools and outstanding recreational op¬ 
portunities. U.S. Citizen or Permanent Resident 
preferred. Send resume and references to Dr. 
Michael C. Murphy, Head, Computer Science 
Engineering Department, Norwich University, 
Northfield, VT 05663. Norwich University is an 
equal opportunity employer. 


CHAPMAN COLLEGE 

Computer Science/Mathematics. Tenure track 
positions in a private college located 30 miles 
south of Los Angeles. Fall, 1987, appointment. 
Salary commensurate with qualifications and ex¬ 
perience. Teaching undergraduate courses in 
computer science and mathematics. Ph.D. in 
Computer Science or related field, demonstrated 
excellence in teaching, and research potential 
required. Send resume, three letters of refer¬ 
ence, and evidence of teaching ability and 
research potential to Gary Ramet, Computer 
Science and Mathematics, Chapman College, 
Orange, CA 92666. Deadline: May 31, 1987.
Affirmative Action/Equal Opportunity Employer.


COMPUTER GRAPHICS SOFTWARE SPECIALIST 

Cranston/Csuri Productions is currently searching
for qualified applicants for a position as
Computer Graphics Software Specialist. Of par¬ 
ticular interest are individuals with experience in 
the areas of display algorithms, animation 
systems and systems software. 

Cranston/Csuri Productions has a staff of 45 pro¬ 
fessionals including animation and technical 
personnel in addition to computer science 
personnel. Cranston/Csuri utilizes DEC VAX, 
Pyramid, and Sun computer systems, outputting 
from frame buffers to video or film recorders. The 
software development environment consists of 
Unix with C. 

Interested parties should submit a resume, tran¬ 
scripts, and three letters of reference to the ad¬ 
dress given below. Preference will be given to 
those individuals with Master of Science degree 
in Computer Science. 

Doreen P. Close, Director of Software Develop¬ 
ment, Cranston/Csuri Productions, Inc., 1501 
Neil Ave., Columbus, OH 43201. C/CP is an equal 
opportunity employer. 


POST DOCTORAL RESEARCH POSITION 

Electrical Engineering Department, University of 
Southern California. Temporary full time posi¬ 
tion for one to three years commencing immedi¬ 
ately, with possibility of extension contingent 
upon funding. Salary $44,000 to $50,000 per an¬ 
num, dependent on qualifications. Assist in 
development of knowledge based system for 
design of testable digital systems, with pro¬ 
fessors and graduate students. Ph.D. degree or 
equivalent required, along with some research 
experience in VLSI design, testing and design 
for test, and a strong background in A.I. and soft¬ 
ware engineering. Send Resume to Professor 
M.A. Breuer, Electrical Engineering-Systems 
Dept. SAL 324, University of Southern California, 
Los Angeles, CA 90089-0781. USC is an equal op¬ 
portunity, affirmative action employer. 


PROGRAMMER ANALYST 

Programmer Analyst to implement and develop 
programs in CICS and IMS data base environ¬ 
ment. Responsible for design, code, test, and 
documentation of COBOL, PL/I and Fortran pro¬ 
grams on large IBM main frames as well as RPS 
Assembler on IBM Series/1. Must conduct feasibility
studies to interface with users regarding
systems design requirements. Must perform
unit and production system tests and supervise 
other analysts. BSCS and 3-4 years programmer 
analyst experience in COBOL or Assembler on 
IBM main frames under OS/VS. Experience must 
include system design of manufacturing proj¬ 
ects. Must know Fortran, PL/I. Job site: Palo Alto. 
40 hour week. $32,500 year. Send ad and resume: 
MLU #2706, P.O. Box 9560, Sacramento, CA 
95823-0560 not later than May 29, 1987. 


UNIVERSITY OF TEXAS 
Health Science Center 

The University of Texas Health Science Center at 
Dallas, an equal opportunity employer, is seek¬ 
ing a Director of Academic Computing Services. 
Responsibilities include the management of 
staff and hardware resources of the unit, advice 
and consultation regarding computing facilities 
and statistical computing analysis, and for¬ 
mulating educational programs to support the 
scientific and academic computing require¬ 
ments of the faculty, fellows, and students. The 
requirements for this position are an advanced 
degree (M.A. or M.S. minimum; a Ph.D. is
preferred) in computing or information science 
with appropriate work experience and demon¬ 
strated abilities for management. Supporting the 
academic computing needs of the Health 
Science Center is of paramount importance. The 
Director reports directly to the Dean. The posi¬ 
tion has excellent opportunity for growth in the 
administrative operations of the institution. 
Reply to: P. O’B. MONTGOMERY JR., M.D., 
Department of Pathology, University of Texas 
Health Science Center at Dallas, 5323 Harry 
Hines Blvd., Dallas, TX 75235-9072. Equal Oppor¬ 
tunity Employer. 
THE HARTFORD GRADUATE CENTER
AFFILIATED WITH 

RENSSELAER POLYTECHNIC INSTITUTE 
Department Chair—Computer Science 

The Hartford Graduate Center is an independent 
institution accredited to award Master's degrees 
from Rensselaer Polytechnic Institute of Troy, 
New York as well as in its own right. The 
Graduate Center invites applications from ex¬ 
perienced persons for the position of Chair¬ 
person, Department of Computer Science, for 
the Fall 1987 semester. 

Research interests within the Department in¬ 
clude Artificial Intelligence, Human-Computer 
Interaction, Software Engineering, Operating 
Systems, Computer Engineering and Database. 
The Department maintains a superb workstation- 
based computing facility (Sun 3 and Apollo) to 
support these interests and the teaching program. 
A Ph.D. in Computer Science or a related area is 
required. Preference will be given to those appli¬ 
cants with a background in Computer Communi¬ 
cations Networks, Software Engineering, or Arti¬ 
ficial Intelligence. Closing date for applications 
is May 30, 1987, and the projected starting date
for the appointment is the ’87-’88 academic year.

Direct vita, along with three references to: 
Michael M. Danchak, Dean, Engineering and 
Science, The Hartford Graduate Center, 275 
Windsor St., Hartford, CT 06120. The Hartford 
Graduate Center is an Equal Opportunity/Affir¬ 
mative Action Employer. 


CONTROLLER/PROGRAMMER 

Business/Engineering Systems Controller/Pro¬ 
grammer for integrated Financial, Engineering 
and Production reporting, Accounting and Bill¬ 
ing Systems for engineering consulting firm in El 
Toro, California (also interview site). 

Convert financial accounting data to detailed 
logic flow charts for computer language coding 
on a main frame computer. Confers with branches 
of company about input/output data/checks/ 
controls. Trains production personnel in use of 
system; analyzes reports; tests programs; cor¬ 
rects errors; prepares instruction booklets and 
trains computer program operators. Overall 
responsibility for converting WIP to billings. 
Three (3) years experience in accounting and one 
(1) year training in computer systems. Forty (40) 
hrs/week; $2,667.00 per month. Send resume to 
Job #MLU 1417, P.O. Box 9560, Sacramento, CA 
95823-0560, not later than June 8, 1987. 


THE HARTFORD GRADUATE CENTER 
AFFILIATED WITH

RENSSELAER POLYTECHNIC INSTITUTE 
Computer Science Faculty 

The Hartford Graduate Center is an independent 
institution accredited to award Master’s degrees 
from Rensselaer Polytechnic Institute of Troy, 
New York as well as in its own right. The Gradu¬ 
ate Center invites applications for faculty posi¬ 
tions, at all levels, in the Department of Com¬ 
puter Science, for the Fall 1987 semester. 
Research interests within the Department in¬ 
clude Artificial Intelligence, Human-Computer 
Interaction, Software Engineering, Operating 
Systems, Computer Engineering and Database. 
The Department maintains a superb workstation- 
based computing facility (Sun 3 and Apollo) to 
support these interests and the teaching program. 
A Ph.D. in Computer Science or a related area is 
required. Preference will be given to those appli¬ 
cants with a background in Computer Communi¬ 
cations Networks, Software Engineering, or Arti¬ 
ficial Intelligence. Closing date for applications 
is May 30, 1987, and the projected starting date 
for the appointment is the ’87-’88 academic year.

Direct vita, along with three references to: 
Michael M. Danchak, Dean, Engineering and 
Science, The Hartford Graduate Center, 275 
Windsor St., Hartford, CT 06120. The Hartford 
Graduate Center is an Equal Opportunity/Affir¬ 
mative Action Employer. 


NEW JERSEY INSTITUTE OF TECHNOLOGY 
Electrical Engineering 

NJIT invites applications for tenure-track faculty
positions at all levels, with a specialty in Com¬ 
puter Engineering. 

Candidates must hold a doctoral degree in elec¬ 
trical or computer engineering and have a strong 
commitment to teaching. Teaching responsi¬ 
bilities will include both undergraduate and 
graduate level courses. Consideration for ap¬ 
pointment at the rank of asst, professor requires 
demonstrated ability to conduct significant 
research. Appointments at higher ranks require 
extensive research experience and demon¬ 
strated leadership in areas of specialization. 
Positions are effective 9/1/87 and will remain 
open until filled. 

NJIT does not discriminate on the basis of sex, 
race, color, handicap, national or ethnic origin, or 
age in employment. 

Send resume to: 

Personnel Box EE-CE, NEW JERSEY INSTITUTE 
OF TECHNOLOGY, Newark, NJ 07102. 


MICHIGAN STATE UNIVERSITY 

The Department of Computer Science invites ap¬ 
plications for tenure track positions at all levels. 
Candidates from all areas of specialization in 
computer science or computer engineering will 
be considered. The department has a special in¬ 
terest in candidates in the areas of programming 
languages, database systems, artificial intelli¬ 
gence and expert systems, robotics, design of 
computer systems and networks, parallel com¬ 
putation, dataflow machines, operating systems 
and computational complexity. Candidates 
should have a Ph.D. in computer science or com¬ 
puter engineering and have a strong interest in 
both research and teaching. Applications will be 
accepted until the positions are filled. 

As a unit within the College of Engineering at 
Michigan State University, Computer Science of¬ 
fers the Bachelor of Science, Master of Science 
and Doctor of Philosophy degrees. Special sup¬ 
port is available from within the college and 
university to initiate research by new faculty 
members. Faculty offices are connected to the 
MSUnet which provides access to an array of 
campus computing resources including the 
facilities of the College of Engineering, the 
Department’s VAX 8600, and the Department's 
Distributed Computing Research Laboratory 
with SUN-2 and SUN-3 workstations. Access to 
select off-campus computers is available, as 
well as access to ARPANET, CSNET, BITNET, 
and MAILNET. Additional facilities available to 
faculty are the Department’s Pattern Recogni¬ 
tion and Image Processing Laboratory and the 
College's A.H. Case Center for Computer-Aided 

Michigan State University enjoys a park-like 
campus of 2,100 developed acres and 3,100 
acres of experimental farms, outlying research 
facilities and natural areas. The campus is adja¬ 
cent to the cities of East Lansing and the capital 
city, Lansing. The Greater Lansing area has ap¬ 
proximately 250,000 residents. The communities 
have fine school systems and place a high value 
on education. 

Applicants should send a resume and a state¬ 
ment of research and teaching interests to: Dr. 
Anthony S. Wojcik, Chairperson, Department of 
Computer Science, A714 Wells Hall, Michigan 
State University, East Lansing, MI 48824-1027,
CSNET: wojcik@mich-state.csnet. Michigan 
State University is an Equal Opportunity/Affir¬ 
mative Action Institution and encourages ap¬ 
plications from members of ethnic minority 
groups. 


NAVAL POSTGRADUATE SCHOOL, 
Monterey, CA 

Position Announcement in Computer Science 

The Department of Computer Science has im¬ 
mediate openings for faculty positions at all 
levels in the areas of Artificial Intelligence and 
Computer Architecture. An applicant should 
have a PhD in Computer Science or a related 
field and have a strong interest in both graduate 
teaching and research. Senior applicants must 
have distinguished research records. Appoint¬ 
ments can begin at any time during the year. 
The Department offers MS and PhD degrees in 
Computer Science supported by well-equipped 
instructional/research facilities and a full-time 
technical staff. The faculty normally teach for 
two quarters and conduct full-time research sup¬ 
ported by major research programs during the 
other two quarters. 

Please send a detailed resume and three letters 
of reference to: Professor Vincent Lum, Chair, 
Computer Science Department (Code 52), Naval 
Postgraduate School, Monterey, CA 93943, 
Telephone (408) 646-2449. NPS IS AN EQUAL
OPPORTUNITY/AFFIRMATIVE ACTION EMPLOYER.


Moving?

PLEASE NOTIFY US 4 WEEKS IN ADVANCE

Name (Please Print)

New Address

City State/Country Zip

MAIL TO:
IEEE Service Center
445 Hoes Lane
Piscataway, NJ 08854

ATTACH LABEL HERE

• This notice of address change will apply to all
IEEE publications to which you subscribe.
• List new address above.
• If you have a question about your subscription,
place label here and clip this form to your letter.
BILKENT UNIVERSITY


BILKENT UNIVERSITY, ANKARA/TURKEY 
Department of Computer Engineering and Infor¬ 
mation Sciences and Electrical and Electronics 
Engineering Departments invite applications for 
several tenure track and visiting faculty posi¬ 
tions at all levels in all areas. These Departments 
offer programs leading to B.S., M.S. and Ph.D. 
degrees. In addition to numerous micros, the 
departments have SUN workstations and a 
supermini computer. These departments are 
under constant expansion in terms of their sizes 
and facilities. Applicants are expected to have a 
Ph.D. in Computer Science/Engineering or Elec¬ 
trical and Electronics Engineering and a strong 
commitment to both research and teaching. The 
search remains open until positions are filled. 
Please forward resume and 3 references to: 
Dean, Faculty of Engineering, P.O. Box 8, 06572 
Maltepe, ANKARA/TURKEY. Tel: (90-41) 664126. 


RICE UNIVERSITY 

Rice University School of Architecture is seek¬ 
ing applicants for a position in the teaching of 
Computer-Aided Design and Computer Graphics.
The individual will be responsible for: teaching 
an introduction to Computer Graphics to majors 
and non-majors; assisting students with com¬ 
puter aided design at the micro station and main 
frame scale; and, developing a research program 
in computer aided design. Applicants should 
have an appropriate degree at the Masters level 
and contact the school no later than March 31, 
1987. Rice University is an equal opportunity/af¬ 
firmative action employer. 

Send inquiries to: Professor Anderson, Head, 
Faculty Search Committee, c/o Doris Anderson, 
School of Architecture, Rice University, P.O. Box 
1892, Houston, TX 77251. 

SUNY PLATTSBURGH COMPUTER SCIENCE 

Tenure-track or visiting positions in Computer
Science, effective September 1, 1987. All specialities
considered, with preference for persons in
Systems Programming and Applied Computer 
Science. Plattsburgh State has a strong under¬ 
graduate Computer Science major. Responses 
particularly solicited from persons with relevant 
advanced degrees and research-oriented ex¬ 
perience in government or industry. Salary 
negotiable. Applications will be accepted until 
positions are filled. Send letter of application 
and three letters of recommendation to: Pro¬ 
fessor Julius A. Archibald, Jr., c/o Office of Per¬ 
sonnel/Affirmative Action, Box 1001, SUNY Col¬ 
lege of Arts and Science, Plattsburgh, NY 
12901-2699. An Equal Opportunity/Affirmative 
Action Employer. 


FLORIDA ATLANTIC UNIVERSITY 

Tenure track senior and junior faculty positions 
in Computer Engineering are available for 
August or January. Strong teaching and research 
interests in specialities such as operating 
systems, computer networks, VLSI, computer ar¬ 
chitecture, fault tolerant systems, parallel pro¬ 
cessing or computer vision are desired. Florida 
Atlantic University, part of the State University
System, is located on Florida’s southeast coast 
among a rapidly growing high tech community. 
Applicants should have an earned doctorate in 
electrical or computer engineering or other 
closely related field. Administrative oppor¬ 
tunities are possible for senior applicants. Send 
resume to Roger A. Messenger, Ph.D., P. E., 
Department of Electrical and Computer Engi¬ 
neering, Florida Atlantic University, Boca Raton, 
FL 33431. An equal opportunity/affirmative ac¬ 
tion employer. 


CALL FOR PAPERS 


Call for papers for Computer 


Computer magazine seeks articles that 
cover the state of the art and important 
new developments in computer science, 
technology, and applications. Aimed at a 
broad audience with diverse interests and 
experience, Computer usually publishes 
surveys or tutorials that facilitate the 
transfer of technology from university to 
industry, from research to applications, 
and across specialized fields. Submit six 
copies of the manuscript, including illus¬ 
trations, references, and authors’ biogra¬ 
phies, to 


Bruce Shriver 

IBM T.J. Watson Research Center 
Route 134 

PO Box 704, HO-B04A 
Yorktown Heights, NY 10598 
(914) 789-7626 
Compmail+: b.shriver 
CSnet: shriver@ibm.com 
Vnet: shriver at yktvmh 


Computer also seeks contributions for: 


special issue that will be published in 
December. The focus of this issue will be 
on integrated optical devices and architec¬ 
tures for optical computing as opposed to 
two-dimensional bulk optical technolo¬ 
gies. Authors should send six copies of 
their manuscripts to one of the guest edi¬ 
tors, and persons interested in serving as 
referees should send their credentials. All 
materials must be submitted by June 30, 
1987, to 


T.E. Batchman 

Dept. of Electrical Engineering
Charlottesville, VA 22901 
(804) 924-6082 


Edward A. Parrish, Jr. 
School of Engineering 
Vanderbilt University 
Nashville, TN 37235 
(615) 322-2762 


Computer Society of the IEEE Technical 
Committee on Computer Education: 
Contributions up to five typewritten pages 
are welcomed for the TCCE newsletter, a 
forum for the exchange of ideas among per¬ 
sons interested in computer education or 
computers in education. Direct news items, 
short articles, and any correspondence to 
Helen Hays, Dept. of Computer Science,
Southeast Missouri State University, Cape 
Girardeau, MO 63701; (314) 651-2244. 


® Compcon Spring 88: February 29-March
4, 1988, San Francisco. This conference
provides a broad-based technical update of
the leading developments in the computing
field. Persons interested in making technical
presentations should submit four copies of
an extended abstract or a complete paper.
Persons interested in organizing a three-
speaker, 90-minute session or panel, or in
presenting a one-day tutorial, should submit
a proposal. Suggestions for topics, sessions,
or tutorials are welcome. The earliest submissions
have the most influence. Submit
materials by June 1, 1987, to Hasan al-Khatib,
Dept. of Electrical Engineering and Computer
Science, University of Santa Clara,
Santa Clara, CA 95053; (408) 554-4485. (See
ad on page 74 of the March issue of
Computer.)


® IEEE Software: Materials on CASE
tools, workstation software, and scientific
systems are sought for the January 1988
issue. Contact Ted Lewis, editor-in-chief, 
IEEE Software, c/o Computer Science Dept., 
Oregon State University, Corvallis, OR 
97331; (503) 754-3273; CSnet, lewis@oregon-
state; Compmail+, t.lewis. Materials are due
by June 1, 1987. 


® Conferences that the Computer Society participates in or sponsors are
indicated by the Computer Society logo; additional conference sponsors
are listed in parentheses. Other conferences of interest to our readers
are also included. 

For inclusion in Call for Papers or Calendar, submit information six weeks 
before the month of publication (e.g., for the August 1987 issue, send information
for receipt by June 15, 1987) to Calendar Editor, Computer, 10662 Los
Vaqueros Circle, Los Alamitos, CA 90720.

® IEEE Transactions on Computers:
Papers are sought for a special issue on
fault-tolerant computing. Guidelines for sub¬ 
mitting manuscripts appear in every issue of 
IEEE Transactions on Computers. Submit six 
copies of paper by June 1, 1987, to Gerald M. 
Masson, Dept. of Computer Science, Johns
Hopkins University, Baltimore, MD 21218; 
(301) 338-7013. 

Micro 20, 20th Annual Workshop on Micropro¬ 
gramming (ACM): December 1-4, 1987, 
Colorado Springs, Colorado. Submit one 
copy of the complete paper by June 1, 1987, 
to Gearold R. Johnson, Center for Computer- 
Assisted Engineering, Colorado State Univer¬ 
sity, Fort Collins, CO 80523; (303) 491-5543. 

Multi 88, 1988 SCS Multiconference: Modeling 
and Simulation on Microcomputers, Conference 
on Power Plant Simulation, Aerospace Simula¬ 
tion, Distributed Simulation, and Simulation 
and Artificial Intelligence: February 3-5, 1988, 
San Diego, California. Submit proposals for 
papers (proposals not to exceed 300 words) 
and proposals for special sessions by June 1, 
1987, to SCS, PO Box 17900, San Diego, CA 
92117; (619) 277-3888. 


Science Dept., University of Toronto, Toronto
M5S 1A4, Ontario, Canada; phone (416)
978-6219.

CSnet, lewis@oregon-state; Compmail+,
t.lewis. Materials are due by August 1, 1987.


Seventh Annual IEEE Phoenix Conference
on Computers and Communications:

March 16-18, 1988, Scottsdale, Arizona. Sub¬ 
mit five copies of a complete paper (5000 
words maximum) that includes an abstract of 
200 to 300 words by July 3, 1987, to Alex C. 
Brown, Jr., Goodyear Aerospace Corp., MS 
1212, PO Box 85, Litchfield Park, AZ 85340; 
(602) 925-7509. Persons interested in submit¬ 
ting proposals for panel sessions should con¬ 
tact Alex Brown. Persons interested in 
offering tutorials should contact Forouzan 
Golshani, Dept. of Computer Science, Arizona
State University, Tempe, AZ 85287;
(602) 965-2855.


Second Workshop on Artificial Intelligence 
Applications in Environmental Sciences 
(NOAA/ERL): September 15-17, 1987, Boul¬ 
der, Colorado. Submit an abstract (300 words 
maximum) by July 15, 1987, to William 
Moninger, NOAA/ERL, R/E2, 325 Broad¬ 
way, Boulder, CO 80303. 


Information Sciences—An International Jour¬ 
nal: Papers are sought for a special issue to 
be published in early 1988 on database sys¬ 
tems. Submit four copies of papers by August 
1, 1987, to Ahmed Elmagarmid, Computer 
Engineering Program, 121 Electrical Engi¬ 
neering East Bldg., Pennsylvania State Uni¬ 
versity, University Park, PA 16802; (814) 
863-1047. 


® IEEE Infocom 88: Networks—Evolution 
or Revolution?: March 28-31, 1988, New 
Orleans. Submit four copies of the complete 
paper by August 3, 1987, to Al Leon-Garcia,
Dept. of Electrical Engineering, University of
Toronto, Toronto, Ontario M5S 1A4,
Canada; phone (416) 978-5037.


Control 88 (IEE): April 13-15, 1988, Oxford, 
England. Submit an extended synopsis (1500 
words maximum) by August 28, 1987, to Conference
Services, IEE, Savoy Pl., London
WC2R 0BL, England, UK; phone 44 (01) 
240-1871. 


NARDAC, Washington/ORNL/DRSD Confer¬ 
ence on Expert Systems Technology in the ADP 
Environment (Naval Regional Data Automation 
Center—Washington, Oak Ridge National 
Laboratory): November 2-3, 1987, Washington,
DC. Submit papers (seven pages maximum)
by June 1, 1987, to Lloyd F. Arrowood,
Oak Ridge National Laboratory, 4500-North, 
MS 207, Oak Ridge, TN 37831; (615) 
576-8272. 


21st Annual Asilomar Conference on Signals, 
Systems, and Computers (IEEE, Naval Post¬ 
graduate School): November 2-4, 1987, Pacific 
Grove, California. Submit three copies of 
both an abstract 50 to 100 words long and an 
extensive summary by June 1, 1987, to 
Douglas F. Elliott, Rockwell International 
Corp., 3370 Miraloma Ave., MS BD07, Ana¬ 
heim, CA 92803-3170. 


® Fourth International Conference on Data 
Engineering: February 2-4, 1988, Los 
Angeles. Both papers and proposals for 
tutorials are sought. Submit four copies of 
completed paper (approximately 5000 words) 
by June 15, 1987, to John V. Carlis, Computer 
Science Dept., University of Minnesota, 207 
Church St. SE, Minneapolis, MN 55455;
(612) 625-6092. Submit proposals for
tutorials by the same date to Amit P. Sheth, 
Honeywell Computer Sciences Center, 1000 
Boone Ave. North, Golden Valley, MN 
55427; (612) 541-6899. 


1988 ACM SIGMetrics Conference on Measure¬ 
ment and Modeling of Computer Systems: May 
24-27, 1988, Santa Fe, New Mexico. Abstracts 
of papers are due in July 1987, and six copies 
of individual papers will be due by September 
30, 1987. Extended abstracts for poster sessions
are due October 30, 1987. Proposals for
tutorial sessions are also sought. Submit 
materials to Kenneth C. Sevcik, Computer 


25th Annual Allerton Conference on Commu¬ 
nication, Control, and Computing: September 
30-October 2, 1987, Monticello, Illinois. Sub¬ 
mit the title and a 1000-word summary of a 
paper suitable for a 20-minute presentation, 
or the title and a 500-word summary of a 
paper suitable for a 10-minute presentation 
by July 27, 1987, to P.R. Kumar, University of 
Illinois at Urbana-Champaign, Coordinated 
Science Laboratory, 1101 W. Springfield Ave., 
Urbana, IL 61801. 

® Workshop on Workstation Operating Systems:
November 5-6, 1987, Cambridge,
Massachusetts. Potential participants should 
submit 10 copies of a position statement 
describing their experiences, interests, and 
future directions that are related to the work¬ 
shop topic by July 27, 1987, to Luis Felipe 
Cabrera, IBM Almaden Research Center, 650 
Harry Rd., San Jose, CA 95120-6099; (408) 
927-1838. 


® Second IEEE Conference on Computer
Workstations: March 7-10, 1988, Santa
Clara, California. Submit six copies of the 
paper (5000 words maximum) by July 29, 
1987, to Lawrence Stewart, DEC Systems 
Research Center, 130 Lytton Ave., Palo Alto, 
CA 94301. 


Third Annual Conference on Artificial Intelli¬ 
gence for Space Applications (NASA): Novem¬ 
ber 2-3, 1987, Huntsville, Alabama. Submit 
unclassified abstracts not exceeding 400 
words by July 31, 1987, to Thomas S. 
Dollman, NASA/EB44, Marshall Space 
Flight Center, AL 35812; (205) 544-3823. 


® IEEE Software: Materials on fourth-generation
language development, software
legal aspects, and software modeling are
sought for the March 1988 issue. Contact Ted 
Lewis, editor-in-chief, IEEE Software, c/o 
Computer Science Dept., Oregon State Uni¬ 
versity, Corvallis, OR 97331; (503) 754-3273; 


IEEE International Conference on Robotics 
and Automation: April 25-29, 1988, Philadel¬ 
phia. Papers 15 to 20 pages long are sought, 
as are papers five to seven pages long. Submit 
papers in either category by September 15, 
1987, to Robert B. Kelley, ECSE Dept., Rensselaer
Polytechnic Institute, Troy, NY
12180-3590. 


Fourth International Conference on Pattern 
Recognition (BPRA, IAPR): March 28-30, 
1988, Cambridge, England. Submit three 
copies of the complete paper (4000 words 
maximum) by September 30, 1987, to J. Kittler,
Dept. of Electronic and Electrical Engineering,
University of Surrey, Guildford GU2
5XH, England, UK. 


14th International Conference on Electric Con¬ 
tacts (IEEE): June 20-24, 1988, Paris. Submit 
an abstract (200 words maximum) by October 
1, 1987, to S.E.E., 48 Rue de la Procession, 
75724 Paris Cedex 15, France. 


1988 International Computers in Engineering 
Conference: Real-World Applications of Expert 
Systems and Artificial Intelligence (ASME): 

August 7-11, 1988, San Francisco. Submit 
abstracts by October 15, 1987, to Edward M. 
Patton, US Army Ballistic Research Lab, 
Aberdeen Proving Grounds, MD 21005. 

® IEEE Transactions on Computers:
Papers are sought for a special issue on
architectural support for programming lan¬ 
guages and operating systems. Guidelines for 
submitting manuscripts appear in every issue 
of IEEE Transactions on Computers. Send 
seven copies of the manuscript by November 
1, 1987, to Randy Katz, Dept. of Electrical
Engineering and Computer Science, Com¬ 
puter Science Division, Evans Hall, Univer¬ 
sity of California, Berkeley, CA 94720; (415) 
642-8778.

CALENDAR


May 1987 


West Coast Testing Workshop, May 
17-20, Lake Tahoe, Nevada. Contact 
Don Nelson, 1575 Garden of the Gods, 
Colorado Springs, CO 80907. 

® Eighth Symposium on Computer Arithmetic
(Computer Society, AEI,
Euromicro, IFIP), May 19-21, Como, Italy.
Contact Mary Jane Irwin, Dept. of Computer
Science, 333 Whitmore Laboratory,
Pennsylvania State University, University
Park, PA 16802 or Luigi Dadda, Dept. of
Electronics, Piazza Leonardo da Vinci 32,
Politecnico di Milano, I-20133 Milan, Italy.

® CG International 87, Fifth International
Conference on Computer Graphics in
Japan (Computer Society, ACM, CGS), May
25-28, Karuizawa, Nagano Prefecture, Japan.
Contact Tosiyasu L. Kunii, Dept. of Information
Science, Faculty of Science, University
of Tokyo, Hongo, Tokyo 113, Japan; phone 
81 (03) 812-2111, ext. 4116. 

® 1987 International Symposium on
Multiple-Valued Logic, May 26-28, Boston.
Contact Dan Simovici, Dept. of
Mathematics and Computer Science, University
of Massachusetts, Boston, MA 02125;
(617) 929-7966. 

CASE-87, First International Workshop on 
Computer-Aided Software Engineering, May 
27-29, Cambridge, Massachusetts. Contact 
Elliot Chikofsky, Index Technology, 1 Main 
St., Cambridge, MA 02142; (617) 491-2100, 


® Spring 87 Workshop of the Technical
Committee on Packaging, May 27-29,
Solvang, California. Contact D. Elaine Pope,
Intel Corp., (SP1-30) 145 S. 79th St., Chan¬ 
dler, AZ 85226; (602) 961-5368. 


June 1987 

® IFIP Workshop on CAD Engines (Computer
Society, IPSJ, Institute of Elect./Com.
Eng. of Japan), June 1-2, Kakaishinlou,
Kainkan, Japan. Contact Tatsuo Ohtsuki,
Waseda University, Dept. of Elec. and
Comm., School of Science and Engineering, 
3-4-2 Okubo, Shinjuku, Tokyo 160, Japan; 
phone 81 (03) 209-3211. 

® Westex 87, Second Western Expert Sys¬ 
tems Conference, June 2-4, Anaheim, 
California. Contact Gene Dippel, PO Box 
2111, Fullerton, CA 92633-0111; (714) 
733-3921. 


® 14th International Symposium on Computer
Architecture, June 2-5, Pittsburgh.
Contact Zary Segall, Computer Science 
Dept., Carnegie Mellon University, Pitts¬ 
burgh, PA 15213; (412) 268-3736. 

ICC-87, IEEE International Conference on 
Communications, June 7-10, Seattle. Contact 
Robert W. Tiplin, Jr., Pacific Northwest Bell, 
1600 Bell Plaza, Seattle, WA 98191; (206) 
346-8679. 

® ICCV-87, First International Conference 
on Computer Vision, June 8-11, London. 
Contact Azriel Rosenfeld, University of 
Maryland, Center for Automation Research, 
College Park, MD 20742; (301) 454-4526. 

Usenix 1987 Summer Conference, June 8-12,
Phoenix, Arizona. Contact Judith F.
DesHarnais, Usenix Conference Office, PO 
Box 385, Sunset Beach, CA 90742; (213) 
592-3243. 

IEEE ICC-87 Workshop on the Integration of 
Expert Systems Into Network Operations, June 
11-12, Seattle. Contact Nassem A. Kahn,
AT&T Bell Laboratories, HR 1 J-220, 480 Red
Hill Rd., Middletown, NJ 07724; (201)
758-5590 or Rao V. Mikkilineni, Bell Com¬ 
munications Research, NVC 2Z-155, 331 
Newman Springs Rd., Red Bank, NJ 07701; 
(201) 578-5590. 

® Second IEEE Workshop on Software
Technology Transfer, June 11-13, Santa 
Fe, New Mexico. Contact Charles Richter, 
MCC, Software Technology Program, PO 
Box 200195, Austin, TX 78720. 

NCC-87, National Computer Conference 
(Computer Society, AFIPS, ACM, 
DPMA, SCS), June 15-18, Chicago. Contact 
AFIPS, 1899 Preston White Dr., Reston, VA 
22091; (703) 620-8900. 

® Second Annual Conference on Structure
in Complexity Theory (Computer Society,
ACM), June 16-19, Ithaca, New York. Contact 
Alan Selman, College of Computer Science, 
161 Cullinane Hall, 360 Huntington Ave., 
Boston, MA 02215; (617) 437-8688. 

® Second Annual Symposium on Logic in
Computer Science (Computer Society, 
ASL, EATCS), June 22-25, Ithaca, New York. 
Contact Ashok K. Chandra, IBM T.J. Wat¬ 
son Research Center, PO Box 218, Yorktown 
Heights, NY 10598; (914) 945-1752. 

NATO Advanced Study Institute on Testing and 
Diagnosis of VLSI and ULSI, June 22-July 4,
Como, Italy. Contact F. Lombardi, University
of Colorado at Boulder, Campus Box
425, Boulder, CO 80309; (303) 492-1437 or
M. Sami, Dept. di Elettronica, Politecnico di
Milano, Piazza Leonardo da Vinci 32,
I-20133 Milan, Italy; phone 39 (2) 2399-3516.


® Second International Conference on
Computers and Applications, June 23-27,
Beijing. Contact Tse-yun Feng, Penn State
University, Electrical Engineering East Bldg.,
University Park, PA 16802 or Oscar N. Garcia,
Dept. of Electrical Engineering and
Computer Science, Rm. T-637, George Wash¬ 
ington University, 801 22nd St. NW, Wash¬ 
ington, DC 20052; (202) 676-7175. 

NECC-87, Eighth National Educational Com¬ 
puting Conference (ACM, SCS), June 24-26, 
Philadelphia. Contact Frank L. Friedman, 
Computer Activities Bldg., Box JA1, Dept. of
Computer and Information Sciences, Temple 
University, Philadelphia, PA 19122; (215) 
787-8450. 

® SIGPlan 87, Symposium on Interpreters
and Interpretive Techniques (Computer 
Society, ACM), June 24-26, St. Paul, Min¬ 
nesota. Contact Mark Scott Johnson, 
Hewlett-Packard Labs, 1501 Page Mill Rd., 
3024, Palo Alto, CA 94304-1181. 

® DAC-87, 24th ACM/IEEE Design Automation
Conference, June 28-July 1,
Miami Beach, Florida. Contact Design Automation
Conference, P.O. Pistilli, MP Associates,
7366 Old Mill Trail, Boulder, CO 80301;
(303) 530-4562. 

Compass 87, Conference on Computer Assurance
(IEEE), June 29-July 3, Washington,
DC. Contact Frank Houston, PO Box 5314,
Rockville, MD 20851; (301) 443-5020. 


July 1987 


® CAR-87, Second International Conference
on Computer-Assisted Radiology (Com¬ 
puter Society, ACM), July 1-4, Berlin. Contact 
Michael L. Rhodes, Multi-Planar Diagnostic 
Imaging, Inc., 2730 Pacific Coast Hwy., Tor¬ 
rance, CA 90505; (213) 539-5944. 

® Workshop on Real-Time Operating Systems,
July 2-3, Cambridge, Massachusetts.
Contact Krithi Ramamritham,
Dept. of Computer and Information Science,
Graduate Research Center, University of
Massachusetts, Amherst, MA 01003; (413)
535-0196 or Al Mok, Dept. of Computer
Science, University of Texas, Austin, TX 
78713; (512) 471-9542. 

® FTCS-17, 17th International Symposium
on Fault-Tolerant Computing, July 6-8,
Pittsburgh. Contact John Shen, Carnegie
Mellon University, Dept. of Electrical and
Computer Engineering, Pittsburgh, PA 
15213; (412) 268-3601. 


® MCC-University Research Symposium,
July 14-15, Austin, Texas. Contact Irwin
P. Graham, MCC, 3500 W. Balcones Dr., 
Austin, TX 78759; (512) 338-8227. 

® ACM SIGGraph 87, July 27-31, Ana¬ 
heim, California. Contact SIGGraph 
87 Conference Management, Smith Bucklin 
and Associates, Inc., 111 E. Wacker Dr., Suite
600, Chicago, IL 60601; (312) 644-6610. 


August 1987 


Seventh International Conference on Computer 
Science (IEEE), August 3-7, Santiago, Chile. 
Contact Hector Garcia-Molina, Dept. of
Computer Science, Princeton University, 
Princeton, NJ 08544. 


® IEEE/ACM Symposium on the Simulation
of Computer Networks (Computer
Society, ACM, SCS), August 5-7, Colorado 
Springs, Colorado. Contact Mitchell Spiegel, 
GTE Systems, 1700 Research Blvd., Rock¬ 
ville, MD 20850; (301) 294-8400. 


® Ninth International Conference on
Production Research (Computer Society,
IFPR), August 17-20, Cincinnati, Ohio. Con¬ 
tact Ernest L. Hall, Center for Robotics 
Research, University of Cincinnati, ML 72, 
Cincinnati, OH 45221. 


1987 International Conference on Parallel 
Processing, August 17-21, St. Charles, Illinois. 
Contact T. Feng, Pennsylvania State University,
Dept. of Electrical Engineering, University
Park, PA 16802; (814) 863-1469.


® 1987 Workshop on Visual Languages,
August 19-21, Linkoping, Sweden. Contact
Erland Jungert, FFV Elektronik AB,
Agatan 29, S-582 22 Linkoping, Sweden, or
R. Korfhage, Dept. of Information Science,
University of Pittsburgh, Pittsburgh, PA 
15260; (412) 624-5206. 

® International Workshop on Petri Nets and
Performance Models (Computer Society,
ACM), August 24-26, Madison, Wisconsin.
Contact Tadao Murata, University of Illinois
at Chicago, Dept. of Electrical Engineering
and Computer Science, MC 154, Box 4348, 
Chicago, IL 60680; (312) 996-2307 or (312) 
996-3422. 

Eurographics 87 (IFIP), August 24-28, Amster¬ 
dam. Contact Secretariat Eurographics 87, 
c/o Organisatie Bureau Amsterdam, 
Europaplein 12, 1078 GZ Amsterdam, The 
Netherlands; phone 31 (20) 44-08-07. 

® 1987 Annual International Test Conference,
August 30-September 4, Washington,
DC. Contact Doris Thomas, PO Box
264, Mount Freedom, NJ 07970; (201) 
895-5260. 


September 1987 


tecture (ACM, IFIP, INRIA), September 14-16, 

Portland, Oregon. Contact John H. Wil¬ 
liams, IBM Almaden Research Center, 
K53/803, 650 Harry Rd., San Jose, CA 
95120-6099. 

Midcon 87 (IEEE), September 15-17, Rosemont, 
Illinois. Contact Alexes Razevich, Electronic 
Conventions Management, 8110 Airport 
Blvd., Los Angeles, CA 90045; (213) 772-2965 
or (800) 421-6816. 

® 26th Lake Arrowhead Workshop:
Specifying Concurrent Systems in the
Year 2000, September 16-18, Lake Arrowhead,
California. Contact Brent Hailpern, IBM T.J.
Watson Research Center, PO Box 704, Yorktown
Heights, NY 10598; (914) 789-7797;
CSnet bth@ibm.com. 

® CSM-87, Conference on Software Maintenance
(Computer Society, AWC,
DPMA, NBS, SMA), September 21-24, Aus¬ 
tin, Texas. Contact Roger Martin, National 
Bureau of Standards, Bldg. 225, Rm. B266, 
Gaithersburg, MD 20899; (301) 921-3545 or 
Computer Society of the IEEE, 1730 Mas¬ 
sachusetts Ave. NW, Washington, DC 
20036-1903; (202) 371-0101. 

® ICDCS-7, Seventh International Conference
on Distributed Computing Systems,
September 21-25, Berlin. Contact R. Popescu- 
Zeletin, Hahn-Meitner-Institut Berlin, 
Glienicker Strasse 100, D-1000, Berlin 39, 
West Germany; phone 49 (30) 8009-2594 or 
49 (30) 8009-2541. 

® International Conference on Software
Engineering for Real-Time Systems, September
28-30, Cirencester, England. Contact
R. Larry, Institute of Electronic and Radio
Engineers, 99 Gower St., London WC1E
6AZ, England, UK.


October 1987 


® 1987 Workshop on Computer Architecture
for Pattern Analysis and Machine
Intelligence, October 5-7, Seattle. Contact
Steve Tanimoto, Dept. of Computer Science,
FR-35, University of Washington, Seattle,
WA 98195; (206) 543-1695.

® 12th Conference on Local Computer Networks,
October 5-7, Minneapolis, Minnesota.
Contact Stephane Johnson, Start,
Inc., 10301 Toledo Ave. South, Bloomington, 
MN 55437; (612) 831-2122. 

® ASPLOS-II, Second International Conference
on Architectural Support for Programming
Languages and Operating Systems
(Computer Society, ACM), October 5-8, Palo 
Alto, California. Contact Randy H. Katz, 
Computer Sciences Division, UC Berkeley, 
Evans Hall, Berkeley, CA 94720 or Martin 


® ICCD-87, IEEE International Conference 
on Computer Design: VLSI in Computers 
and Processors, October 5-8, Rye Brook, New 
York. Contact Prathima Agrawal, AT&T Bell 
Laboratories, 600 Mountain Ave., Rm. 
3D-480, Murray Hill, NJ 07974; (201) 
582-6943. 

® Compsac 87 (Computer Society, IPSJ),
October 5-9, Tokyo. Contact Tosiyasu
L. Kunii, c/o Business Center for Academic 
Societies Japan, Yamazaki Bldg. 4F, 2-40-14, 
Hongo, Bunkyo-ku, Tokyo 113, Japan; phone 
81 (3) 817-5831, or Albert K. Hawkes, Sargent 
& Lundy, Engineering Consultants, 55 E. 
Monroe, Chicago, IL 60603; (312) 269-3640, 
or Stephen S. Yau, Northwestern University, 
Dept. of Electrical Engineering and Computer
Science, Evanston, IL 60201; (312)
491-3641. 

® FOCS-87, October 12-16, Los Angeles.
Contact Ashok Chandra, IBM T.J.
Watson Research Center, PO Box 218, Yorktown
Heights, NY 10598; (914) 945-1752.

® Third Annual Expert Systems in Government
Conference (Computer Society,
AIAA), October 19-23, Washington, DC. 
Contact Peter Bonasso, Mitre Washington AI 
Center, 7725 Colshire Blvd., MS W952, 
McLean, VA 22102; (703) 883-6908. 

® FJCC-87, Fall Joint Computer Confer¬ 
ence, October 25-29, Dallas. Contact 
Computer Society of the IEEE, 1730 Mas¬ 
sachusetts Ave. NW, Washington, DC 
20036-1903; (202) 371-0101. 


November 1987 


21st Annual Asilomar Conference on Signals, 
Systems, and Computers (IEEE, Naval Post¬ 
graduate School), November 2-4, Pacific 
Grove, California. Contact Douglas F.
Elliott, Rockwell International Corp., 3370
Miraloma Ave., MS BD07, Anaheim, CA 
92803-3170. 

® Workshop on Workstation Operating Sys¬ 
tems, November 5-6, Cambridge, Mas¬ 
sachusetts. Contact Joseph Boykin, Custom 
Software Systems, PO Box 678, Natick, MA 
01760; (617) 653-2555. 


® ICCAD-87, IEEE International Conference
on Computer-Aided Design, November
9-12, Santa Clara, California. Contact
Basant Chawla, AT&T Bell Laboratories,
1247 S. Cedar Crest Blvd., Allentown, PA
18103; (215) 770-3484. 

® International Conference on Information
Science and Engineering (Computer Society,
IERE), November 25-27, York, England.
Contact R. Larry, Institute of Electronic and
Radio Engineers, 99 Gower St., London
WC1E 6AZ, England, UK.

ACM/IEEE 24th

DESIGN AUTOMATION CONFERENCE® 


THE WORLD’S 
PREMIER CAD/CAM 
CONFERENCE 



DON’T MISS THE 
TUESDAY BEACH PARTY! 


BE A WINNER 
SIGN UP FOR DAC TODAY! 


The Design Automation Conference has become the event of 
the year for the electronic CAD/CAM engineer. In this 24th year 
DAC will offer: 

• Over 130 papers, tutorials and workshops 

• Over 100 vendors of CAD hardware and software will 
exhibit their products. 

• Over 50 vendor technical presentations on Sunday 
afternoon 

YES! THE BEST IS GETTING BETTER. 




SIG DA 


Miami Beach Convention Center, Miami Beach, FL 
JUNE 28-JULY 1, 1987 
Advance Registration ends May 30, 1987!!! 


“YOU CAN’T DO THAT”


Build a large-scale mainframe computer
that will outperform the competition’s 
leading model? 

“IMPOSSIBLE!” they said. 

But Amdahl did it back in the early 1970’s. 
And today we are a leader in the develop¬ 
ment, manufacturing, marketing and sup¬ 
port of general purpose and scientific 
computer systems, storage products, 
communications systems and software. 

In less than two decades we have grown 
from 5 to more than 7,000 “can do” 
employees around the globe. Our suc¬ 
cess is a result of teamwork, innovation 
and commitment to achieve the impossi¬ 
ble. If you are ready for challenge, creativ¬ 
ity and growth, explore your opportuni¬ 
ties with Amdahl in one of the following 
developmental areas. 

COMMUNICATIONS 

YOU CAN join an entirely new project 
to bring program compatibility and 
interoperability to the SNA and OSI 
environments. We’re using some of the 
most sophisticated software tools to do it. 
UNIX* System V Release 3 STREAMS for 
one. Our own mainframe UNIX* time¬ 
sharing system —UTS**— for another. 
We need experienced, dedicated people 
to join our project in the following areas: 

• OPERATING SYSTEM DEVELOPMENT
& SUPPORT ENGINEERS
• SOFTWARE DESIGNERS
& DEVELOPERS
• COMMUNICATIONS DEVELOPERS
• NETWORK PROTOCOL ARCHITECTS
• NETWORK MANAGEMENT
ARCHITECTS


OPERATING SYSTEMS 

YOU CAN develop a new, large main¬ 
frame operating system... self-tuning and 
does not require system generation. Use 
your 3+ years’ operating system develop¬ 
ment experience to join us in one of the 
following key areas: 

• SECURE SYSTEMS
• TRANSACTION PROCESSING
• NETWORKING
• OPERATING SYSTEM DEVELOPMENT

UNIX* 

YOU CAN “make” the kernel in 3 
minutes! Use your 3+ years’ development
experience in UNIX* and/or 370 operat¬ 
ing system to join our UTS** develop¬ 
ment team in the following positions: 

• OPERATING SYSTEM DEVELOPERS
• TOOLS DEVELOPERS
• NETWORK SOFTWARE DEVELOPERS
• OPERATING SYSTEM
VALIDATION ENGINEERS


YOU CAN enjoy the benefits and com¬ 
petitive salary you would expect from an 
industry leader. To apply, send your 
resume to: M. Montgomery, Amdahl 
Corporation, Employment Department, 
Dept. 5-1, P.O. Box 3470, M/S 300, Sunny¬ 
vale, California 94088-3470. Principals 
only please. 


YOU CAN AT 

Amdahl


*UNIX is a registered trademark of AT&T
**UTS is a trademark of Amdahl Corporation

Amdahl Corporation is proud to be an 
equal opportunity employer through 
affirmative action. 











